xmllint handling of recursion
When resolving xi:include elements, xmllint reports the error "detected a recursion" in a case where it should allow the processing.
Background
The specification for XML Inclusions (XInclude) Version 1.0 (Second Edition) addresses possible recursive cases in section 4.2.7 "Inclusion Loops". Specifically, the section states:
When recursively processing an xi:include element, it is a fatal error to process another xi:include element with an include location and xpointer attribute value that have already been processed in the inclusion chain. [Boldface added.]
The specification goes on to identify the cases that are illegal:
The following are illegal:
- An xi:include element pointing to itself or any ancestor thereof, when parse="xml".
- An xi:include element pointing to any include element or ancestor thereof which has already been processed at a higher level.
xmllint.c performs a check at line 642 to see if the URI matches any previous URIs stored in the stack variable urlTab. If there is a match, an error is flagged and no further processing occurs; the <xi:include> tag remains in the stream unless there is alternate handling through the optional tags.
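The difference between a URL-only comparison and the (include location, xpointer) comparison that the specification describes can be sketched as follows. This is a minimal illustration in Python, not libxml2's code. The function names, the Target tuple, and the assumption that the top-level document sits on the stack as a whole-document entry with no xpointer are all hypothetical, and URIs are assumed to be already resolved to a common form.

```python
from typing import List, Optional, Tuple

# Hypothetical representation of one entry in the inclusion chain:
# (resolved include location, xpointer value or None for a whole document).
Target = Tuple[str, Optional[str]]


def url_only_recursion(chain_urls: List[str], url: str) -> bool:
    """Flag a loop whenever the URL alone is already on the stack."""
    return url in chain_urls


def pair_recursion(chain: List[Target], target: Target) -> bool:
    """Flag a loop only when the same (location, xpointer) pair is on the stack."""
    return target in chain


# Assumption: the document being processed is on the stack as a whole-document entry.
top_level: Target = ("red.xml", None)
# An include inside that document which pulls in one fragment of the same file.
include: Target = ("red.xml", "t100")

print(url_only_recursion([top_level[0]], include[0]))  # True: rejected
print(pair_recursion([top_level], include))            # False: allowed
print(pair_recursion([top_level, include], include))   # True: a genuine repeat
```

Whether a pair comparison alone covers every case the specification lists is not something this sketch tries to settle.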
Problem Case
Here is an example where an <xi:include> refers to the same file in a legal manner and does not qualify as one of the illegal constructs:
jlpoole@ares ~/work/xml $ cat -n red.xml
1 <?xml version="1.0" encoding="UTF-8"?>
2 <book xmlns:xi="http://www.w3.org/2001/XInclude"
3 xmlns:xlink="http://www.w3.org/1999/xlink">
4 <chapter>
5 <para xml:id="t100">Introduction</para>
6 </chapter>
7 <chapter>
8 <xi:include href="./red.xml" xpointer="t100" parse="xml"/>
9 </chapter>
10 </book>
jlpoole@ares ~/work/xml $ xmllint --xinclude --output full.xml red.xml
red.xml:8: element include: XInclude error : detected a recursion in red.xml
jlpoole@ares ~/work/xml $ cat -n full.xml
1 <?xml version="1.0" encoding="UTF-8"?>
2 <book xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xlink="http://www.w3.org/1999/xlink">
3 <chapter>
4 <para id="t100">Introduction</para>
5 </chapter>
6 <chapter>
7 <xi:include href="./red.xml" xpointer="t100" parse="xml"/>
8 </chapter>
9 </book>
jlpoole@ares ~/work/xml $
In the above red.xml, the <xi:include> at line 8 is not a descendant of the <para> element in line 5. Hence the line 8 <xi:include> should be processed, and not rejected merely because its href value matches an entry, e.g. the current file's, in the urlTab stack.
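The ancestor relationship can be checked mechanically. The sketch below uses Python's standard xml.etree.ElementTree on a copy of red.xml. The helper is_self_or_ancestor is a hypothetical name, and the xpointer is treated as a bare-name pointer resolved by a simple xml:id lookup, which is an assumption made for this example only.

```python
import xml.etree.ElementTree as ET

XI_INCLUDE = "{http://www.w3.org/2001/XInclude}include"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

RED_XML = """<?xml version="1.0" encoding="UTF-8"?>
<book xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:xlink="http://www.w3.org/1999/xlink">
  <chapter>
    <para xml:id="t100">Introduction</para>
  </chapter>
  <chapter>
    <xi:include href="./red.xml" xpointer="t100" parse="xml"/>
  </chapter>
</book>
"""


def is_self_or_ancestor(root, include, target) -> bool:
    """Return True if target is the include element itself or one of its ancestors."""
    # ElementTree has no parent pointers, so build a child -> parent map first.
    parent_of = {child: parent for parent in root.iter() for child in parent}
    node = include
    while node is not None:
        if node is target:
            return True
        node = parent_of.get(node)
    return False


root = ET.fromstring(RED_XML.encode("utf-8"))
include = next(root.iter(XI_INCLUDE))
# Assumed bare-name resolution: find the element whose xml:id equals the xpointer.
target = next(el for el in root.iter() if el.get(XML_ID) == include.get("xpointer"))

print(target.tag)                                  # para
print(is_self_or_ancestor(root, include, target))  # False
```

For comparison, passing root as the target returns True, since <book> is an ancestor of the <xi:include>.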
What should appear in full.xml is the following:
1 <?xml version="1.0" encoding="UTF-8"?>
2 <book xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xlink="http://www.w3.org/1999/xlink">
3 <chapter>
4 <para xml:id="t100">Introduction</para>
5 </chapter>
6 <chapter>
7 <para xml:id="t100">Introduction</para>
8 </chapter>
9 </book>
Note: The xml:id Version 1.0 specification provides: "An xml:id processor should assure that the following constraint holds:
- The values of all attributes of type “ID” (which includes all xml:id attributes) within a document are unique."
I'm allowing for the directive not to be followed because I want to reuse content. In the example above, the presence of two <para> elements with the same id value does not conform to the "should be unique" directive.
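For anyone who wants to see which identifiers end up repeated after inclusion, a small check along these lines may help. It is a sketch using Python's standard xml.etree.ElementTree, which parses the document without enforcing xml:id uniqueness. The function name duplicate_xml_ids is hypothetical, and the input is the expected full.xml from above with the line numbers and XML declaration omitted.

```python
import xml.etree.ElementTree as ET
from collections import Counter

XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

EXPECTED_FULL_XML = """<book xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:xlink="http://www.w3.org/1999/xlink">
<chapter>
<para xml:id="t100">Introduction</para>
</chapter>
<chapter>
<para xml:id="t100">Introduction</para>
</chapter>
</book>
"""


def duplicate_xml_ids(root):
    """Return the sorted xml:id values that occur more than once under root."""
    counts = Counter(
        el.get(XML_ID) for el in root.iter() if el.get(XML_ID) is not None
    )
    return sorted(value for value, n in counts.items() if n > 1)


root = ET.fromstring(EXPECTED_FULL_XML)
print(duplicate_xml_ids(root))  # ['t100']
```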
I am wondering if xmllint.c (3/2/22 edit: should be xinclude.c) might be modified, and I want to make sure it is agreed that the above sample file red.xml is legal. It seems to me that a comparison beyond just the file name is warranted, although I am not certain how that would be accomplished. I want to get some opinions from the project owners on this issue before undertaking a modification.