Schema validation fails with whitespace preceding a QName attribute
To reproduce the issue, run the command:
xmllint --schema http://www.xbrl.org/2013/inlineXBRL/xhtml-inlinexbrl-1_1.xsd http://www.xbrlquery.com/PASS-nonFraction-nesting-formats.html --noout
I see the errors below where none are expected:
http://www.xbrlquery.com/PASS-nonFraction-nesting-formats.html:38: element nonFraction: Schemas validity error : Element '{http://www.xbrl.org/2013/inlineXBRL}nonFraction', attribute 'format': The QName value ' ixt:numdotcomma' has no corresponding namespace declaration in scope.
http://www.xbrlquery.com/PASS-nonFraction-nesting-formats.html:38: element nonFraction: Schemas validity error : Element '{http://www.xbrl.org/2013/inlineXBRL}nonFraction', attribute 'format': ' ixt:numdotcomma' is not a valid value of the atomic type 'xs:QName'.
This command was run on https://colab.research.google.com/ after installing libxml2-utils, for which xmllint --version reports the following:
xmllint: using libxml version 20904
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ICU ISO8859X Unicode Regexps Automata Expr Schemas Schematron Modules Debug Zlib Lzma
The example used is from the conformance suite of a specification called Inline-XBRL (iXBRL). iXBRL is now used by financial regulators around the globe, such as the US (SEC), UK (Bank of England), EU (ESMA and EBA) and many others.
iXBRL defines an extended XHTML schema. This schema is passed as the --schema argument in the example. The errors occur because line 38 of the file being validated contains two instances of an attribute called 'format'. Each of these attributes contains a QName value, and that value is surrounded by whitespace.
The definition of @format is in http://www.xbrl.org/2013/inlineXBRL/xhtml-inlinexbrl-1_1-definitions.xsd. It is defined to contain a QName. In turn, QName is defined in https://www.w3.org/2001/XMLSchema.xsd with its whiteSpace facet set to collapse, so surrounding whitespace is allowed in the document and should be removed before the value is validated.
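For reference, here is a minimal Python sketch of what whitespace-collapse means in XML Schema: tabs, line feeds and carriage returns are replaced by spaces, runs of spaces become a single space, and leading and trailing spaces are removed. The function name is my own and this is only an illustration of the rule, not libxml code.

```python
import re

def collapse_whitespace(value: str) -> str:
    """Illustrative version of the XML Schema whiteSpace="collapse" rule."""
    # "replace" step: tab, line feed and carriage return each become a space.
    replaced = re.sub(r"[\t\n\r]", " ", value)
    # "collapse" step: runs of spaces become one space, then the leading and
    # trailing spaces are removed.
    return re.sub(r" +", " ", replaced).strip(" ")

# The attribute value from the error message, with its leading space.
print(repr(collapse_whitespace(" ixt:numdotcomma")))  # 'ixt:numdotcomma'
```

Under this rule the value that reaches QName validation would be 'ixt:numdotcomma', with no leading space.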
Valid QName values for @format are defined elsewhere in the iXBRL schema and the ones used in @format of the example are correct. This can be verified: once the whitespace is removed, libxml schema validation succeeds.
libxml schema validation also accepts trailing whitespace, but it objects if whitespace precedes the attribute's QName value. As the error messages above show, libxml fails to interpret the QName values in @format as ones defined by the iXBRL schema. This appears to be because it treats the leading whitespace as part of the QName value to be validated. With the leading whitespace removed, libxml validates the same QName values successfully, which shows the iXBRL schema is correct in this regard.
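The following Python sketch shows how a leading space could produce the first error above. It is only a guess at the failure mode, not a description of libxml internals: the function, the namespace map and the urn:example:ixt URI are hypothetical stand-ins, not values taken from the test file.

```python
# Hypothetical in-scope namespace bindings; the URI is a placeholder.
IN_SCOPE = {"ixt": "urn:example:ixt"}

def resolve_qname(lexical, namespaces, collapse_first):
    """Split a prefixed QName and look its prefix up in `namespaces`."""
    if collapse_first:
        # A QName has no internal whitespace, so trimming the XML whitespace
        # characters at both ends is enough for this illustration.
        lexical = lexical.strip(" \t\n\r")
    prefix, sep, local = lexical.partition(":")
    if not sep:
        return (None, prefix)  # unprefixed name, no lookup needed
    if prefix not in namespaces:
        raise LookupError(f"no namespace declaration in scope for {prefix!r}")
    return (namespaces[prefix], local)

# With collapsing, the prefix is 'ixt' and the lookup succeeds.
print(resolve_qname(" ixt:numdotcomma", IN_SCOPE, collapse_first=True))

# Without collapsing, the prefix is ' ixt' (leading space) and the lookup fails.
try:
    resolve_qname(" ixt:numdotcomma", IN_SCOPE, collapse_first=False)
except LookupError as err:
    print(err)
```

In this sketch a trailing space ends up on the local name rather than the prefix, so the namespace lookup still succeeds. That would be consistent with only leading whitespace triggering the "no corresponding namespace declaration" error, but it remains a guess.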
I believe, and of course I may well be wrong, that libxml should strip the leading whitespace before using the QName value to look up a defined QName. Because it does not, libxml is unable to find the corresponding valid QName.