libxml2 issues
https://gitlab.gnome.org/GNOME/libxml2/-/issues
2024-03-25T23:00:31Z
https://gitlab.gnome.org/GNOME/libxml2/-/issues/706
Add an option for postfix of lib name for msvc.
2024-03-25T23:00:31Z
GNOME Gitlab Automation
The following Merge Request (MR) has been forwarded from GitHub in order to prevent
the GNOME Project from losing contributions coming from un-official channels. And for
contributors to not see their valuable contributions not being accounted for.
Relevant information:
Github handle: TheBetterSolution
MR URL: https://github.com/GNOME/libxml2/pull/33
Patch URL: https://github.com/GNOME/libxml2/pull/33.patch
Body of the MR:
I want to use a uniform library name for debug/release/shared/static builds. Our product only builds with one of the possible options, so a uniform name makes the library easier to manage.
Could you please add a new CMake option to control whether the postfix is appended to the library name?
https://gitlab.gnome.org/GNOME/libxml2/-/issues/701
Improve fuzz coverage
2024-03-15T19:35:42Z
Nick Wellnhofer
This is a tracking issue for improving fuzz coverage.
## Fuzzing with virtual machines
To fuzz the countless API entry points, the idea is to implement simple virtual machines which execute fuzz data as programs, mostly mapping opcodes to function calls. A set of registers is provided for each argument and return type. This makes it possible to simulate all kinds of API usage patterns.
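The idea can be sketched with a toy interpreter. This is only an illustration under stated assumptions, not the fuzzer in the libxml2 tree: `ToyVm`, `toy_vm_run`, the opcode set, and the stand-in functions `toy_add` and `toy_xor` are all hypothetical, and a real harness would map opcodes to libxml2 calls and keep separate register files per argument and return type. `LLVMFuzzerTestOneInput` is the standard libFuzzer entry point.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_REGS 4

/* Hypothetical opcodes; a real harness would have one per API function. */
enum { OP_LOAD = 0, OP_ADD, OP_XOR, OP_COPY, OP_COUNT };

typedef struct {
    uint32_t regs[NUM_REGS]; /* one register file; real code needs one per type */
} ToyVm;

/* Stand-ins for the API entry points under test. */
static uint32_t toy_add(uint32_t a, uint32_t b) { return a + b; }
static uint32_t toy_xor(uint32_t a, uint32_t b) { return a ^ b; }

/* Interpret fuzz data as a program: each instruction is three bytes,
 * an opcode, a destination register, and an operand. Every byte value
 * is mapped to a valid opcode or register, so no input is rejected. */
static void toy_vm_run(ToyVm *vm, const uint8_t *data, size_t size) {
    size_t i;

    for (i = 0; i < NUM_REGS; i++)
        vm->regs[i] = 0;

    for (i = 0; i + 3 <= size; i += 3) {
        uint8_t op = data[i] % OP_COUNT;
        uint8_t dst = data[i + 1] % NUM_REGS;
        uint8_t arg = data[i + 2];

        switch (op) {
        case OP_LOAD: /* load an immediate value */
            vm->regs[dst] = arg;
            break;
        case OP_ADD:  /* call an API function with register arguments */
            vm->regs[dst] = toy_add(vm->regs[dst], vm->regs[arg % NUM_REGS]);
            break;
        case OP_XOR:
            vm->regs[dst] = toy_xor(vm->regs[dst], vm->regs[arg % NUM_REGS]);
            break;
        case OP_COPY: /* move a value between registers */
            vm->regs[dst] = vm->regs[arg % NUM_REGS];
            break;
        }
    }
}

/* libFuzzer entry point: run the fuzz input as a program. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    ToyVm vm;
    toy_vm_run(&vm, data, size);
    return 0;
}
```

Because return values land in registers that later instructions can use as arguments, a single input can chain calls in arbitrary order, which is what lets the fuzzer explore API usage patterns rather than single calls.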
### Core node operations
This is currently being worked on by @nwellnhof.
Includes most API functions in
- tree.h
- valid.h
- entities.h
- HTMLtree.h
### Needing improvements
- xmlreader.h
- xpath.h
### Completely uncovered APIs
- c14n.h
- catalog.h
- encoding.h
- pattern.h
- xmlsave.h
## Unimportant modules
- relaxng.h
- schematron.h
- xmlwriter.h
https://gitlab.gnome.org/GNOME/libxml2/-/issues/700
false negative validation: error '1835': 0.04 must be greater than 0
2024-03-25T11:45:24Z
Carsten F
Hello,
I use libxml2 to validate an XML file against an XSD schema and face this issue:
![Anmerkung_2024-03-07_074329](/uploads/226c90b9979a0f5a35ad16d38a265c7b/Anmerkung_2024-03-07_074329.png)
As you can see, the error message is simply wrong: 0.04 is obviously greater than 0. Using xmllint from the terminal with the exact same XML and XSD files does not find any issues with the validation. However, using libxml2 within my C++ code does.
As a workaround, it is possible to change the type float to decimal in the XSD file. But this is not an option for me, as the XSD is part of a bigger project where the XSD is autogenerated and also used for code generation. Changing float to decimal would come with huge extra work.
Here is the xml file
```
<tns:Root xmlns:tns="XXX" xmlns:xsi="XXX">
<Class id="1">
<Value>0.04</Value>
</Class>
</tns:Root>
```
and here is the XSD file:
```
<?xml version="1.0" encoding="UTF-8"?>
<schema xmlns="http://www.w3.org/2001/XMLSchema" xmlns:tns="XXX" xmlns:p="XXX" xmlns:p1="XXX" targetNamespace="XXX">
<element name="Root">
<complexType>
<sequence>
<element name="Class" type="tns:ClassType" maxOccurs="unbounded" minOccurs="0">
<annotation>
<documentation/>
</annotation>
</element>
</sequence>
</complexType>
</element>
<complexType name="ClassType">
<sequence>
<element name="Value" minOccurs="0" maxOccurs="1">
<annotation>
<documentation>blabla</documentation>
<appinfo>
<p:default>0.040</p:default>
</appinfo>
</annotation>
<simpleType>
<restriction base="float">
<minExclusive value="0"/>
<maxExclusive value="1"/>
</restriction>
</simpleType>
</element>
</sequence>
<attribute name="id" type="int" use="required"/>
</complexType>
</schema>
```
my c++ code using libxml2 is here:
[xmlvalidator.h](/uploads/d98a9ce18c2dd4d4c258c84be022b76d/xmlvalidator.h)
[xmlvalidator.cpp](/uploads/06375a5c033f092a9366ebed6f4b15ae/xmlvalidator.cpp)
Is this an issue of libxml2 or is the issue with my code?
I use libxml2 2.9.13 but also built version 2.12.13 from source; the issue occurs with both versions.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/698
xmllint: Exit code 0 despite namespace error
2024-03-01T13:49:52Z
Christian Weiske
I want to use `xmllint` to validate XML files in a pre-commit hook. Unfortunately, namespace errors are reported but do still give an exit code of `0`:
```
$ echo '<f:a/>' | xmllint --noout -; echo $?
-:1: namespace error : Namespace prefix f on a is not defined
<f:a/>
^
0
```
I did not see any option in the man page that would change the exit code to non-zero if a namespace error occurs.
Please add such an option, or default to changing the exit code on namespace errors.
----
- OS: Debian GNU/Linux 12 (bookworm)
- Package: libxml2-utils 2.9.14+dfsg-1.3~deb12u1
```
$ xmllint --version
xmllint: using libxml version 20914
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ICU ISO8859X Unicode Regexps Automata Schemas Schematron Modules Debug Zlib Lzma
```
----
Related: https://mail.gnome.org/archives/xml/2021-June/msg00000.html
https://gitlab.gnome.org/GNOME/libxml2/-/issues/694
testModule fails with meson: testdso shared lib path depends on the build system
2024-02-21T13:12:25Z
vtorri
When I try a meson build on Linux, the `testModule` test fails because the testdso shared lib is searched for in `MODULE_PATH`, which is defined as `.libs` (because of the autotools, I guess).
With meson, it is located elsewhere and the test fails.
One possible solution for this issue: define `MODULE_PATH` in `config.h` and include `config.h` in testModule.c.
Or pass the path as an argument to the test and retrieve it from argv.
Maybe there are better solutions.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/691
Fixed/default attributes from XSD are saved as normal attributes
2024-03-07T11:08:53Z
Davide Capodaglio
When using fixed/default attributes in XSD validation, to “apply” to an XML document fixed attributes that are only defined in the schema but not in the real XML content, it is required to call
xmlSchemaSetValidOptions(ctx, XML_SCHEMA_VAL_VC_I_CREATE);
However, when later re-saving the XML to disk after some modification, all the fixed attributes are saved as well…
An internal “specified” property should be added to each attribute (default and fixed attributes must have specified=false), so that only the truly explicit attributes are saved to disk.
See also issue #284
https://gitlab.gnome.org/GNOME/libxml2/-/issues/674
[html serializer] Top level domains in link href attributes are incorrectly urlencoded
2024-01-31T12:10:48Z
Timo Brembeck
When using the HTML serializer, all link attributes are urlencoded. This makes sense for the URL path, however not so much for the domain part.
Domains may only contain ascii characters. Some registrars allow punycode representations for unicode characters in domain names, but no other representations ([RFC 5895](https://datatracker.ietf.org/doc/html/rfc5895)).
Most browsers will probably open links like https://www.baf%C3%B6g.de without issues, however, other libraries might not be that forgiving, e.g. python's requests library will throw an error.
To reproduce this problem:
`xmllint --html <(echo "<a href='https://www.bafög.de'>https://www.bafög.de</a>")`
```
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body>
<a href="https://www.baf%C3%83%C2%B6g.de">https://www.bafög.de</a>
</body></html>
```
This doesn't change when unicode encoding is used:
`xmllint --html --encode utf-8 <(echo "<a href='https://www.bafög.de'>https://www.bafög.de</a>")`
```
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN" "http://www.w3.org/TR/REC-html40/loose.dtd">
<html><body><a href="https://www.baf%C3%83%C2%B6g.de">https://www.bafög.de</a>
</body></html>
```
In those examples, the urlencoding is even a bit more broken than when used as part of lxml (`https://www.baf%C3%83%C2%B6g.de` decodes to `https://www.bafÃ¶g.de`), but maybe that's due to the usage via the CLI.
In my opinion, libxml2 should either parse the URL correctly and exclude the domain part from the urlencoding (or apply punycode encoding), like it does for the protocol, or provide an option to turn off urlencoding of links altogether.
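The first of these proposals could look roughly like the sketch below. It is not libxml2's serializer code: `escape_path_only` is a hypothetical helper that assumes a `scheme://authority/path` layout, copies the scheme and authority through unchanged, and percent-encodes only non-ASCII bytes after the authority. It does not perform punycode conversion.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: percent-encode non-ASCII bytes of a URL, but only
 * after the authority (host) part. Returns 0 on success, -1 if the URL has
 * no "://" or the output buffer is too small. */
static int escape_path_only(const char *url, char *out, size_t outsz) {
    static const char hex[] = "0123456789ABCDEF";
    const char *sep = strstr(url, "://");
    const char *path;
    size_t n = 0;

    if (sep == NULL)
        return -1;

    /* The authority ends at the first '/', '?' or '#' after "://". */
    path = sep + 3 + strcspn(sep + 3, "/?#");

    /* Copy scheme and authority verbatim. */
    for (; url < path; url++) {
        if (n + 1 >= outsz)
            return -1;
        out[n++] = *url;
    }

    /* Escape non-ASCII bytes in the remainder. */
    for (; *url != '\0'; url++) {
        unsigned char c = (unsigned char) *url;
        if (c < 0x80) {
            if (n + 1 >= outsz)
                return -1;
            out[n++] = (char) c;
        } else {
            if (n + 3 >= outsz)
                return -1;
            out[n++] = '%';
            out[n++] = hex[c >> 4];
            out[n++] = hex[c & 15];
        }
    }

    out[n] = '\0';
    return 0;
}
```

Leaving the host as raw UTF-8 only avoids the broken percent-encoding; producing a strictly ASCII URL would still require an IDNA/punycode step, which is outside this sketch.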
I'm sorry if I missed something and there already exists a similar ticket or it isn't a valid bug for other reasons.
For reference, here is the ticket I initially opened on lxml's side: https://bugs.launchpad.net/lxml/+bug/2051597
And an issue from my project where this problem arose: https://github.com/digitalfabrik/integreat-cms/issues/2274
https://gitlab.gnome.org/GNOME/libxml2/-/issues/666
Quadratic behavior in XPath translate() function
2024-01-19T14:47:04Z
Nick Wellnhofer
Looking up characters in `xmlXPathTranslateFunction` is slow with large translation strings.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/665
UPA and extensions
2024-01-19T10:56:41Z
Robby Simpson
If I create an XML schema (XSD) that includes a complexType that extends another complexType, resulting in a violation of unique particle attribution (UPA), it still validates using libxml / xmllint but shows as invalid with other libraries such as Xerces.
I realize this use case may seem somewhat contrived, but it is based on a real-world standards application attempting to follow [these extensibility recommendations](https://www.xml.com/pub/a/2004/10/27/extend.html) (All new components in existing or new namespace(s) for each compatible version (# 3)) to allow for future versions of the standardized XSD.
Here is a simplified XSD and an example XML demonstrating the issue:
XSD:
```
<?xml version='1.0' encoding='UTF-8'?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="http://example.com" targetNamespace="http://example.com" elementFormDefault="qualified" attributeFormDefault="unqualified" version="2.2">
<xs:complexType name="Child">
<xs:complexContent>
<xs:extension base="Parent">
<xs:sequence>
<xs:element name="Foo" type="xs:string" minOccurs="0" maxOccurs="1"/>
<xs:element name="r2_3" type="Revision2_3Type" minOccurs="0" maxOccurs="1"/>
<xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
<xs:anyAttribute processContents="lax"/>
</xs:extension>
</xs:complexContent>
</xs:complexType>
<xs:complexType name="Parent">
<xs:sequence>
<xs:element name="r2_3" type="Revision2_3Type" minOccurs="0" maxOccurs="1"/>
<xs:any namespace="##other" processContents="lax" minOccurs="0" maxOccurs="unbounded"/>
</xs:sequence>
<xs:anyAttribute processContents="lax"/>
</xs:complexType>
<xs:complexType name="Revision2_3Type">
<xs:sequence>
<xs:any processContents="lax" minOccurs="1" maxOccurs="unbounded" namespace="##targetNamespace"/>
</xs:sequence>
<xs:anyAttribute processContents="lax"/>
</xs:complexType>
<xs:element name="Child" type="Child" />
</xs:schema>
```
XML:
```
<Child xmlns="http://example.com">
<r2_3>
<test>bar</test>
</r2_3>
</Child>
```
https://gitlab.gnome.org/GNOME/libxml2/-/issues/662
python libraries in cmake build do not install in Python_SITEARCH
2024-01-16T14:57:16Z
heitbaum
We have just migrated to using cmake instead of configure and had a "regression." We reverted the following commit, and now cmake is equivalent to configure in our use case. In commit 02e12371964ed10c2c84ebb49760bc11b34913e1 the change was made to not install in ${Python_SITEARCH} but install in ${CMAKE_INSTALL_PREFIX}/python. Whilst I probably understand why this was done, is it possible to have a cmake variable like LIBXML2_USE_PYTHON_SITEARCH=ON to swap between the directories?
https://github.com/LibreELEC/LibreELEC.tv/pull/8523
https://gitlab.gnome.org/GNOME/libxml2/-/issues/660
Improve XPath error messages
2024-01-10T17:35:26Z
Nick Wellnhofer
Commit 954b8984 removed some extra XPath error messages which were only reported to the global error handler. In a few cases, these error messages contained helpful information which is lost now. For example
- Unregistered function: which function?
- Undefined variable: which variable?
- Undefined namespace prefix: which prefix?
We should add an extra string argument to `xmlXPathErr` which is added to error messages and structured errors.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/659
Quadratic behavior with abusive ATTLISTs
2024-01-10T16:46:15Z
Nick Wellnhofer
Processing ATTLISTs with long names and many attributes has quadratic behavior. From a quick look, this is because we store attribute declarations in a single hash table, indexed by element name plus attribute name. Possible fixes:
- Store attribute declarations in a second-level hash table. Requires rewriting a lot of the DTD validation code.
- Severely limit length of element names in ATTLISTs.
- Use our existing amplification checks when processing ATTLISTs.
The last approach looks like the most attractive.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/658
CI: Python tests fail on MinGW
2024-02-13T13:02:35Z
Nick Wellnhofer
Python tests suddenly fail with `ModuleNotFoundError: No module named 'libxml2'`. The MinGW Python version and everything else seems unchanged:
- Success: https://gitlab.gnome.org/GNOME/libxml2/-/jobs/3422216
- Failure: https://gitlab.gnome.org/GNOME/libxml2/-/jobs/3429790
This is probably related to the `os.add_dll_directory` call required for newer Python versions.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/656
Quadratic behavior with namespaces in `xmlDocCopyNode`
2024-01-03T14:41:59Z
Nick Wellnhofer
`xmlDocCopyNode` uses `xmlSearchNs` to look up namespaces. The latter iterates all namespace declarations of ancestor nodes, leading to quadratic behavior in trees with deep nesting and many namespace declarations. Copying of namespaces also seems more convoluted than necessary. I think we only need a hash table that stores whether a namespace was declared inside the copied subtree.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/631
Drop the HTTP and FTP clients
2023-11-27T11:01:17Z
Demi Obenour
There’s no good reason to be using them nowadays. Both protocols are insecure and deprecated, and schemas and DTDs should be provided locally rather than fetched over a network.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/630
Make single CRs increment line number
2023-11-26T15:07:06Z
Nick Wellnhofer
Currently, only LFs and CRLFs increment line numbers, which seems inconsistent.
This should be fixed by changing the `NEXTL` macro, which would also allow fixing the unexpected side effect of `xmlCurrentChar` skipping CRs.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/613
xmlReconciliateNs shifts namespace if a xmlns:default namespace exists
2023-12-27T13:22:04Z
Niels Dossche
The following code:
```c
#include <stdlib.h>
#include <string.h>
#include <libxml/parser.h>
#include <libxml/tree.h>
int main(void) {
    const char *input =
        "<?xml version=\"1.0\"?>"
        "<A xmlns=\"urn:A\">"
        "<B>"
        "<C xmlns=\"urn:C\" xmlns:default=\"urn:other\"/>"
        "</B>"
        "</A>";
    xmlDocPtr doc = xmlReadMemory(input, (int) strlen(input), NULL, NULL, 0);
    if (doc == NULL) {
        abort();
    }
    xmlNodePtr root = xmlDocGetRootElement(doc);
    // Make a copy, and reconciliate the namespaces
    xmlDocPtr doc2 = xmlNewDoc(NULL);
    xmlNodePtr B_copy = xmlDocCopyNode(root->children /* <B> */, doc2, 1);
    xmlAddChild((xmlNodePtr) doc2, B_copy);
    xmlReconciliateNs(doc2, B_copy);
    // Print the copied document to stdout
    xmlSaveFormatFileEnc("-", doc2, "UTF-8", 1);
    xmlFreeDoc(doc);
    xmlFreeDoc(doc2);
    return 0;
}
```
Results in this output:
```
<?xml version="1.0" encoding="UTF-8"?>
<B xmlns="urn:A" xmlns:default="urn:C">
<default:C xmlns="urn:C" xmlns:default="urn:other"/>
</B>
```
Notice how the C element is now in the `urn:other` namespace instead of `urn:C` (the xmlns:default namespace is redeclared on C).
I expected C to still be in the `urn:C` namespace, as in the original document.
I.e. I expected this output (or something equivalent):
```
<?xml version="1.0" encoding="UTF-8"?>
<B xmlns="urn:A">
<C xmlns="urn:C"/>
</B>
```
https://gitlab.gnome.org/GNOME/libxml2/-/issues/612
Minor cleanups in tree.c
2023-11-02T23:00:36Z
GNOME Gitlab Automation
The following Merge Request (MR) has been forwarded from GitHub in order to prevent
the GNOME Project from losing contributions coming from un-official channels. And for
contributors to not see their valuable contributions not being accounted for.
Relevant information:
Github handle: nielsdos
MR URL: https://github.com/GNOME/libxml2/pull/31
Patch URL: https://github.com/GNOME/libxml2/pull/31.patch
Body of the MR:
Minor cleanups in tree.c as found by Stack static analyser.
Submitting via GitHub because I don't have permissions to fork on GNOME's GitLab, and I also don't have permissions to send a merge request by email.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/611
URI wrong for fully qualified path on Windows
2023-12-26T15:29:02Z
Niels Dossche
The following code:
```
xmlDocPtr doc = xmlReadFile("C:\\does_not_exist.xml", NULL, 0);
printf("%s\n", xmlGetLastError()->str1);
```
will output `file:/C:/does_not_exist.xml`, while I expected to see `file://` or even `file:///`, as `file:/` isn't valid.
Linux behaves correctly with fully qualified paths.
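For illustration, the expected mapping from a drive-letter path to a `file:///` URI can be sketched as follows. `win_path_to_file_uri` is a hypothetical helper, not a libxml2 function; it only handles `X:\...` style paths and does no percent-escaping.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: turn a Windows drive path such as C:\dir\file.xml
 * into file:///C:/dir/file.xml. Returns 0 on success, -1 if the input is
 * not a drive-letter path or the output buffer is too small. */
static int win_path_to_file_uri(const char *path, char *out, size_t outsz) {
    static const char prefix[] = "file:///";
    size_t plen = sizeof(prefix) - 1;
    size_t len = strlen(path);
    size_t i;

    /* Require a drive letter, a colon, and a path separator. */
    if (len < 3 || !isalpha((unsigned char) path[0]) || path[1] != ':' ||
        (path[2] != '\\' && path[2] != '/'))
        return -1;

    if (plen + len + 1 > outsz)
        return -1;

    memcpy(out, prefix, plen);
    /* Copy the path, replacing backslashes with forward slashes. */
    for (i = 0; i < len; i++)
        out[plen + i] = (path[i] == '\\') ? '/' : path[i];
    out[plen + len] = '\0';
    return 0;
}
```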
Similarly, even if the load succeeds, the URI in `doc->URL` is wrong.
https://gitlab.gnome.org/GNOME/libxml2/-/issues/609
(innermost/terminating) indents in xmllint not enough
2023-10-22T20:45:20Z
simon place
```
simon@fedora:~/Documents$ xmllint --format a1.xml
<?xml version="1.0"?>
<a>
<a>
<a>
<a>
</a>
</a>
</a>
</a>
simon@fedora:~/Documents$ cat a1.xml
<?xml version="1.0"?>
<a>
<a>
<a>
<a>
</a>
</a>
</a>
</a>
simon@fedora:~/Documents$ xmllint --version
xmllint: using libxml version 21105
compiled with: Threads Tree Output Push Reader Patterns Writer SAXv1 FTP HTTP DTDValid HTML Legacy C14N Catalog XPath XPointer XInclude Iconv ISO8859X Unicode Regexps Automata Schemas Schematron Modules Debug Zlib Lzma
simon@fedora:~/Documents
```