- 13 May, 2021 2 commits
-
-
Daniel Veillard authored
Prompted by CVE-2021-3541, but this includes an awful lot of serious bug fixes by Nick and others.
  - configure.ac: bumped to new release
  - doc/*: updated and regenerated
-
Daniel Veillard authored
This is related to parameter entity expansion and follows the pattern of the billion laughs attack. Somehow the counting of parameter entities was missed in that code path, so the normal algorithm based on entity "density" was useless.
-
- 09 May, 2021 1 commit
-
-
Nick Wellnhofer authored
Always call nameNsPush instead of namePush. The latter is unused now and should probably be removed from the public API. I can't see how it could be used reasonably from client code and the unprefixed name has always polluted the global namespace. Fixes a null pointer dereference introduced with de5b624f when parsing in SAX1 mode. Found by OSS-Fuzz.
-
- 08 May, 2021 2 commits
-
-
Nick Wellnhofer authored
Make the parser context's "pushTab" point to an array of structs instead of void pointers. This avoids casting unrelated types to void pointers, improving readability and portability, and allows for more efficient packing. Ultimately, the struct could be extended to include the contents of "nameTab" and "spaceTab", further simplifying the code. Historically, "pushTab" was only used by the push parser (hence the name), so the change to the public headers should be safe. Also remove an unused parameter from xmlParseEndTag2.
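The idea of storing one typed record per open element, rather than several loosely related values cast to void pointers, can be sketched as follows. This is a minimal illustration only: the names StartTag, TagStack, tag_stack_push and tag_stack_pop and the chosen fields are hypothetical and are not taken from libxml2's headers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-element record. The field set is an assumption made
 * for illustration, not libxml2's actual definition. */
typedef struct {
    const char *prefix;   /* namespace prefix, or NULL */
    const char *uri;      /* namespace URI, or NULL */
    int line;             /* line number of the start tag */
    int ns_count;         /* namespaces declared on this element */
} StartTag;

/* Growable stack of records, one entry per open element. */
typedef struct {
    StartTag *items;
    size_t len;
    size_t cap;
} TagStack;

/* Push one record. Returns 0 on success, -1 on allocation failure. */
int tag_stack_push(TagStack *s, const char *prefix, const char *uri,
                   int line, int ns_count)
{
    assert(s != NULL);
    if (s->len == s->cap) {
        size_t new_cap = s->cap ? s->cap * 2 : 8;
        StartTag *tmp = realloc(s->items, new_cap * sizeof(*tmp));
        if (tmp == NULL)
            return -1;
        s->items = tmp;
        s->cap = new_cap;
    }
    s->items[s->len].prefix = prefix;
    s->items[s->len].uri = uri;
    s->items[s->len].line = line;
    s->items[s->len].ns_count = ns_count;
    s->len++;
    return 0;
}

/* Pop the top record, or return NULL if the stack is empty.
 * The returned pointer stays valid until the next push. */
const StartTag *tag_stack_pop(TagStack *s)
{
    assert(s != NULL);
    if (s->len == 0)
        return NULL;
    return &s->items[--s->len];
}
```

Because every field has its own type, no integer has to be smuggled through a pointer cast, and the compiler can pack the two int fields next to each other.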
-
Nick Wellnhofer authored
Re-add the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF, which was removed in commit 62150ed2. That commit also introduced a regression for direct users of xmlParseContent: unclosed tags weren't checked.
-
- 07 May, 2021 1 commit
-
-
Nick Wellnhofer authored
Commit 62150ed2 introduced a small regression in the error messages for mismatched tags. This typically only affected messages after the first mismatch, but with custom SAX handlers all line numbers would be off. This also fixes line numbers in the SAX push parser which were never handled correctly.
-
- 06 May, 2021 1 commit
-
-
Nick Wellnhofer authored
Fix regression introduced with b25acce8. Some users like libxslt may call the HTML output functions on documents with uppercase tag names, so we must keep case-insensitive string comparison. Fixes #248.
-
- 03 May, 2021 1 commit
-
-
Fixes #242.
-
- 01 May, 2021 1 commit
-
-
Nick Wellnhofer authored
Check return value of recursive calls to xmlParseElementChildrenContentDeclPriv and return immediately in case of errors. Otherwise, struct xmlElementContent could contain unexpected null pointers, leading to a null deref when post-validating documents which aren't well-formed and parsed in recovery mode. Fixes #243.
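The pattern described here, returning immediately when a recursive call fails so that no half-built node escapes, might look roughly like the following. The toy grammar (a lowercase letter, or two contents in parentheses separated by a comma) and the names Node, parse_content and free_node are invented for illustration and are unrelated to the real content-model parser.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy content node: a leaf carries a name, an interior node has
 * exactly two children. */
typedef struct Node {
    struct Node *left;
    struct Node *right;
    char name;
} Node;

void free_node(Node *n)
{
    if (n == NULL)
        return;
    free_node(n->left);
    free_node(n->right);
    free(n);
}

/* Parse: content := letter | '(' content ',' content ')'
 * Returns NULL on any error, so a non-NULL interior node is
 * guaranteed to have both children set. */
Node *parse_content(const char **cur)
{
    Node *n;

    assert(cur != NULL && *cur != NULL);
    n = calloc(1, sizeof(*n));
    if (n == NULL)
        return NULL;

    if (**cur >= 'a' && **cur <= 'z') {
        n->name = *(*cur)++;
        return n;
    }
    if (**cur != '(')
        goto error;
    (*cur)++;

    n->left = parse_content(cur);
    if (n->left == NULL)          /* check the recursive call and stop */
        goto error;
    if (**cur != ',')
        goto error;
    (*cur)++;

    n->right = parse_content(cur);
    if (n->right == NULL)         /* same check for the second child */
        goto error;
    if (**cur != ')')
        goto error;
    (*cur)++;
    return n;

error:
    free_node(n);
    return NULL;
}
```

Code that later walks the tree can then rely on interior nodes never having a null child, even when the input was malformed.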
-
- 25 Apr, 2021 1 commit
-
-
Nick Wellnhofer authored
Fixes #238.
-
- 22 Apr, 2021 3 commits
-
-
Nick Wellnhofer authored
The --dropdtd option can leave dangling pointers in entity reference nodes. Make sure to skip these nodes when processing XIncludes. This also avoids scanning entity declarations and even modifying them inadvertently during XInclude processing. Move from a block list to an allow list approach to avoid descending into other node types that can't contain elements. Fixes #237.
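The difference between a block list and an allow list can be shown with a short sketch. The NodeKind enum and should_descend are hypothetical stand-ins, and which kinds the real XInclude code allows is not stated here, so the two allowed cases below are an assumption.

```c
#include <assert.h>

/* Hypothetical node kinds, for illustration only. */
typedef enum {
    NODE_ELEMENT,
    NODE_TEXT,
    NODE_ENTITY_REF,
    NODE_DOCUMENT,
    NODE_DTD,
    NODE_ENTITY_DECL,
    NODE_COMMENT
} NodeKind;

/* Allow list: descend only into kinds known to contain elements.
 * Anything else is skipped, including kinds added to the enum later. */
int should_descend(NodeKind kind)
{
    assert(kind >= NODE_ELEMENT && kind <= NODE_COMMENT);
    switch (kind) {
    case NODE_DOCUMENT:
    case NODE_ELEMENT:
        return 1;
    default:
        return 0;
    }
}
```

A block list would enumerate the kinds to skip and descend into everything else, which silently includes any kind nobody thought to list.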
-
Nick Wellnhofer authored
Reset doc->intSubset when dropping the DTD.
-
- 21 Apr, 2021 1 commit
-
-
Nick Wellnhofer authored
Call htmlCtxtUseOptions to make sure that names aren't stored in dictionaries. Note that this issue only affects xmllint using the HTML push parser. Fixes #230.
-
- 20 Mar, 2021 1 commit
-
-
Nick Wellnhofer authored
  - Include xmlversion.h before testing feature flags.
  - Include libxml headers before extern "C".
Fixes #226.
-
- 16 Mar, 2021 2 commits
-
-
Chris Degawa authored
Currently, it catches mingw-w64 as well, but mingw-w64 follows Linux-like naming with no weird postfixes.
Signed-off-by: Christopher Degawa <ccom@randomderp.com>
-
Nick Wellnhofer authored
-
- 13 Mar, 2021 3 commits
-
-
Nick Wellnhofer authored
The DBL_MAX approach could lead to errors caused by excess precision. Switch back to the division-by-zero approach with a work-around for MSVC and use the extern globals instead of macro expressions.
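A minimal sketch of the division-by-zero approach, assuming IEEE 754 doubles, is shown below. The global names and the init function are hypothetical. The local variable stands in for the MSVC work-around; that this matches what the commit actually does is an assumption.

```c
#include <assert.h>
#include <float.h>

/* Hypothetical globals holding the special values. They are filled in
 * at run time instead of being defined as macro expressions. */
double sketch_nan = 0.0;
double sketch_pinf = 0.0;
double sketch_ninf = 0.0;

void sketch_init_special_values(void)
{
    /* Divide by a variable rather than a literal: some compilers reject
     * division by zero inside constant expressions (assumed to be the
     * reason a work-around is needed). */
    double zero = 0.0;

    sketch_nan = 0.0 / zero;     /* 0/0 yields NaN under IEEE 754 */
    sketch_pinf = 1.0 / zero;    /* 1/0 yields +infinity */
    sketch_ninf = -sketch_pinf;

    /* NaN is the only value that does not compare equal to itself. */
    assert(sketch_nan != sketch_nan);
}
```

After initialization, sketch_pinf compares greater than DBL_MAX and sketch_ninf less than -DBL_MAX, without relying on an overflowing multiplication that excess precision could keep finite.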
-
Nick Wellnhofer authored
Make xmlGetNodePath return NULL instead of invalid XPath when hitting unsupported node types like DTD content.
Reported here: https://mail.gnome.org/archives/xml/2021-January/msg00012.html
Original report: https://bugs.php.net/bug.php?id=80680
-
Nick Wellnhofer authored
Fix another case where only recursion depth was limited, but entities would still be expanded over and over again. The test case discovered by fuzzing only affected parsing in recovery mode with XML_PARSE_RECOVER. Found by OSS-Fuzz.
-
- 04 Mar, 2021 3 commits
-
-
Nick Wellnhofer authored
-
Nick Wellnhofer authored
Switch to binary search.
-
Nick Wellnhofer authored
Switch to binary search. This is the first time bsearch is used in the libxml2 code base. But it's a standard library function since C89 and should be portable.
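For readers unfamiliar with bsearch, a lookup in a sorted static table can be sketched like this. The Entry type, the table contents and lookup_entry are hypothetical and do not reproduce the tables touched by these commits; only bsearch and strcmp come from the C standard library.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *name;
    int value;
} Entry;

/* The table must stay sorted by name in strcmp order, otherwise
 * bsearch gives undefined results. */
static const Entry table[] = {
    { "amp",  38 },
    { "apos", 39 },
    { "gt",   62 },
    { "lt",   60 },
    { "quot", 34 }
};

/* bsearch passes the key first and a table element second. */
static int compare_entry(const void *key, const void *elem)
{
    return strcmp((const char *)key, ((const Entry *)elem)->name);
}

/* Returns the matching entry, or NULL if the name is not in the table. */
const Entry *lookup_entry(const char *name)
{
    assert(name != NULL);
    return bsearch(name, table, sizeof(table) / sizeof(table[0]),
                   sizeof(table[0]), compare_entry);
}
```

Compared with a linear scan, the number of string comparisons grows with the logarithm of the table size, at the cost of having to keep the table sorted.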
-
- 02 Mar, 2021 2 commits
-
-
Nick Wellnhofer authored
-
Nick Wellnhofer authored
I can't see a reason to check attribute content for UTF-8 validity. Other parts of the API like xmlNewText have always assumed valid UTF-8 as extra checks only slow down processing. Besides, setting doc->encoding to "ISO-8859-1" seems pointless, and not freeing the old encoding would cause a memory leak. Note that this was last changed in 2008 with commit 6f8611fd which removed unnecessary encoding/decoding steps. Setting attributes should be even faster now. Found by OSS-Fuzz.
-
- 01 Mar, 2021 2 commits
-
-
Nick Wellnhofer authored
OSS-Fuzz has been fuzzing the HTML parser with inputs up to 1 MB for several hundred hours without hitting the 20s timeout. It seems that most timeouts resulting from accidentally quadratic behavior in the HTML parser have been fixed. Start to gradually reduce the timeout to find new performance issues.
-
Nick Wellnhofer authored
Add a special case for the predefined XML namespace when looking up DTD attribute defaults in xmlGetPropNodeInternal to avoid calling xmlGetNsList. This fixes quadratic behavior in
  - xmlNodeGetBase
  - xmlNodeGetLang
  - xmlNodeGetSpacePreserve
Found by OSS-Fuzz.
-
- 22 Feb, 2021 8 commits
-
-
Nick Wellnhofer authored
Only run the following tests by default:
  - gcc
  - clang:asan
  - cmake:mingw:w64-x86_64:shared
  - cmake:msvc:v141:x64:shared
-
Nick Wellnhofer authored
  - Add more calls to xmlInitializeCatalog.
  - Call xmlResetLastError after fuzzing each input.
-
Nick Wellnhofer authored
-
Markus Rickert authored
-
Nick Wellnhofer authored
xmlInitializeCatalog is not called from xmlInitParser.
-
Nick Wellnhofer authored
This reverts commit de1b51ed.
-
Nick Wellnhofer authored
-
Nick Wellnhofer authored
Call htmlInitAutoClose during fuzzer initialization to fix stability issue. Leave a note concerning problems with this function.
-
- 21 Feb, 2021 1 commit
-
-
Markus Rickert authored
-
- 20 Feb, 2021 3 commits
-
-
Nick Wellnhofer authored
Under certain circumstances, the HTML parser would try to guess and switch input encodings multiple times, leading to slow processing of documents with encoding errors. The repeated scanning of the input buffer when guessing encodings could even lead to quadratic behavior.
The code in htmlCurrentChar probably assumed that if there's an encoding handler, it is guaranteed to produce valid UTF-8. This holds true in general, but if the detected encoding was "UTF-8", the UTF8ToUTF8 encoding handler simply invoked memcpy without checking for invalid UTF-8. This still must be fixed, preferably by not using this handler at all.
Also leave a note that switching encodings twice seems impossible to implement correctly.
Add a check when handling UTF-8 encoding errors in htmlCurrentChar to avoid this situation, even if encoders produce invalid UTF-8.
Found by OSS-Fuzz.
-
-
Closes #219.
-
- 09 Feb, 2021 1 commit
-
-
Nick Wellnhofer authored
Found by OSS-Fuzz.
-