  1. 13 May, 2021 2 commits
    • Release of libxml2-2.9.11 · e1bcffea
      Daniel Veillard authored
      Prompted by CVE-2021-3541, but this includes an awful lot of serious bug
      fixes by Nick and others.
      - configure.ac: bumped to new release
      - doc/* updated and regenerated
    • Patch for security issue CVE-2021-3541 · 8598060b
      Daniel Veillard authored
      This is related to parameter entity expansion and follows
      the lines of the billion laughs attack. Somehow in that path the
      counting of parameters was missed and the normal algorithm based
      on entities "density" was useless.
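As a rough illustration of the kind of accounting the CVE-2021-3541 message refers to, the sketch below counts the bytes produced by entity expansion and rejects input whose expansion is out of proportion to the bytes actually read. The struct, the function name, and both thresholds are hypothetical and chosen for this sketch only; they are not libxml2's data structures or limits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping, not libxml2's actual parser context. */
typedef struct {
    size_t consumed;  /* bytes read from the input so far */
    size_t expanded;  /* bytes produced by entity expansion so far */
} ExpansionStats;

/* Illustrative thresholds, picked arbitrarily for this sketch. */
#define MAX_FREE_EXPANSION 1000000u  /* expansion below this is always allowed */
#define MAX_RATIO 10u                /* allowed expanded-to-consumed ratio */

/*
 * Record one expansion of `len` bytes.
 * Returns 0 if parsing may continue, -1 if the expansion looks like
 * an amplification attack (or the counter would overflow).
 */
int account_expansion(ExpansionStats *st, size_t len)
{
    assert(st != NULL);
    if (len > SIZE_MAX - st->expanded)
        return -1;
    st->expanded += len;
    if (st->expanded <= MAX_FREE_EXPANSION)
        return 0;
    if (st->expanded / MAX_RATIO > st->consumed)
        return -1;
    return 0;
}
```

A check like this only helps if every expansion path calls it. In this sketch that would mean calling account_expansion for parameter entities as well as for general entities.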
  2. 09 May, 2021 1 commit
    • Fix null deref in legacy SAX1 parser · bfd2f430
      Nick Wellnhofer authored
      Always call nameNsPush instead of namePush. The latter is unused now
      and should probably be removed from the public API. I can't see how
      it could be used reasonably from client code and the unprefixed name
      has always polluted the global namespace.
      Fixes a null pointer dereference introduced with de5b624f when parsing
      in SAX1 mode.
      Found by OSS-Fuzz.
  3. 08 May, 2021 2 commits
    • Store per-element parser state in a struct · ce00c36e
      Nick Wellnhofer authored
      Make the parser context's "pushTab" point to an array of structs
      instead of void pointers. This avoids casting unrelated types to void
      pointers, improving readability and portability, and allows for more
      efficient packing. Ultimately, the struct could be extended to include
      the contents of "nameTab" and "spaceTab", further simplifying the code.
      Historically, "pushTab" was only used by the push parser (hence the
      name), so the change to the public headers should be safe.
      Also remove an unused parameter from xmlParseEndTag2.
    • Fix handling of unexpected EOF in xmlParseContent · de5b624f
      Nick Wellnhofer authored
      Re-add the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was
      removed in commit 62150ed2.
      That commit also introduced a regression for direct users of
      xmlParseContent: unclosed tags weren't checked.
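The idea behind ce00c36e (an array of structs in place of an array of void pointers) can be sketched as follows. This is a minimal illustration under assumptions: the type names, field names, and helper functions are hypothetical and do not reproduce libxml2's headers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical per-element record; the fields are illustrative. */
typedef struct {
    const char *prefix;   /* namespace prefix of the start tag */
    const char *uri;      /* namespace URI of the start tag */
    int line;             /* line number of the start tag */
    int ns_count;         /* namespaces declared on this element */
} ElemState;

/* Growable stack of typed records, one per open element. */
typedef struct {
    ElemState *tab;
    int nr;    /* entries in use */
    int max;   /* entries allocated */
} ElemStack;

/* Push one record; returns 0 on success, -1 on allocation failure. */
int elem_push(ElemStack *s, const char *prefix, const char *uri,
              int line, int ns_count)
{
    assert(s != NULL);
    if (s->nr >= s->max) {
        int new_max = s->max > 0 ? s->max * 2 : 8;
        ElemState *tmp = realloc(s->tab, (size_t)new_max * sizeof(*tmp));
        if (tmp == NULL)
            return -1;
        s->tab = tmp;
        s->max = new_max;
    }
    s->tab[s->nr].prefix = prefix;
    s->tab[s->nr].uri = uri;
    s->tab[s->nr].line = line;
    s->tab[s->nr].ns_count = ns_count;
    s->nr++;
    return 0;
}

/* Topmost record, or NULL if the stack is empty. */
const ElemState *elem_top(const ElemStack *s)
{
    return s->nr > 0 ? &s->tab[s->nr - 1] : NULL;
}

void elem_pop(ElemStack *s)
{
    if (s->nr > 0)
        s->nr--;
}

void elem_free(ElemStack *s)
{
    free(s->tab);
    s->tab = NULL;
    s->nr = s->max = 0;
}
```

Because each slot has a declared type, integers such as the line number are stored as integers and never have to be cast to and from a pointer.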
  4. 07 May, 2021 1 commit
    • Fix line numbers in error messages for mismatched tags · 3e80560d
      Nick Wellnhofer authored
      Commit 62150ed2 introduced a small regression in the error messages for
      mismatched tags. This typically only affected messages after the first
      mismatch, but with custom SAX handlers all line numbers would be off.
      This also fixes line numbers in the SAX push parser which were never
      handled correctly.
  5. 06 May, 2021 1 commit
    • Fix htmlTagLookup · 7279d236
      Nick Wellnhofer authored
      Fix regression introduced with b25acce8. Some users like libxslt may
      call the HTML output functions on documents with uppercase tag names,
      so we must keep case-insensitive string comparison.
      Fixes #248.
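A case-insensitive tag lookup of the kind described above might look like the sketch below. It assumes a table sorted by lowercase name that is searched with bsearch; the table contents and all function names are hypothetical, and only ASCII letters are folded.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* ASCII case-insensitive string comparison (assumes the "C" locale). */
int ascii_casecmp(const char *a, const char *b)
{
    for (;; a++, b++) {
        int ca = tolower((unsigned char)*a);
        int cb = tolower((unsigned char)*b);
        if (ca != cb)
            return ca - cb;
        if (ca == 0)
            return 0;
    }
}

/* Tiny hypothetical tag table, sorted by lowercase name. */
static const char *const tag_table[] = {
    "a", "body", "br", "div", "html", "p", "span"
};

/* bsearch comparator: the key is a string, the entry a table slot. */
int cmp_key_to_entry(const void *key, const void *entry)
{
    return ascii_casecmp((const char *)key, *(const char *const *)entry);
}

/* Returns the canonical lowercase name, or NULL if the tag is unknown. */
const char *tag_lookup(const char *name)
{
    assert(name != NULL);
    const char *const *hit = bsearch(name, tag_table,
                                     sizeof(tag_table) / sizeof(tag_table[0]),
                                     sizeof(tag_table[0]), cmp_key_to_entry);
    return hit != NULL ? *hit : NULL;
}
```

With this comparator, "DIV", "Div", and "div" all resolve to the same entry, which is the behavior callers with uppercase tag names rely on.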
  6. 03 May, 2021 1 commit
  7. 01 May, 2021 1 commit
    • Propagate error in xmlParseElementChildrenContentDeclPriv · babe7503
      Nick Wellnhofer authored
      Check return value of recursive calls to
      xmlParseElementChildrenContentDeclPriv and return immediately in case
      of errors. Otherwise, struct xmlElementContent could contain unexpected
      null pointers, leading to a null deref when post-validating documents
      which aren't well-formed and parsed in recovery mode.
      Fixes #243.
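The pattern described above, checking the result of each recursive call and returning immediately on failure, can be sketched with a toy grammar (item := letter | '(' item ',' item ')'). The node type and the parser are hypothetical and far simpler than a DTD content model.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy tree node: a leaf holds a letter, a group holds two children. */
typedef struct Node {
    char name;            /* leaf letter, or 0 for a group */
    struct Node *left;
    struct Node *right;
} Node;

void free_tree(Node *n)
{
    if (n == NULL)
        return;
    free_tree(n->left);
    free_tree(n->right);
    free(n);
}

/*
 * Parse one item and advance *p past it.
 * Returns NULL on any error, so a caller never sees a group node
 * with a missing child.
 */
Node *parse_item(const char **p)
{
    assert(p != NULL && *p != NULL);
    Node *n = calloc(1, sizeof(*n));
    if (n == NULL)
        return NULL;
    if (**p >= 'a' && **p <= 'z') {
        n->name = *(*p)++;
        return n;
    }
    if (**p != '(')
        goto fail;
    (*p)++;
    n->left = parse_item(p);
    if (n->left == NULL)        /* propagate the error immediately */
        goto fail;
    if (**p != ',')
        goto fail;
    (*p)++;
    n->right = parse_item(p);
    if (n->right == NULL)       /* propagate the error immediately */
        goto fail;
    if (**p != ')')
        goto fail;
    (*p)++;
    return n;
fail:
    free_tree(n);
    return NULL;
}
```

Without the two NULL checks, a malformed input such as "(a,(b,))" would yield a group whose right child is NULL, and later code walking the tree could dereference it.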
  8. 25 Apr, 2021 1 commit
  9. 22 Apr, 2021 3 commits
  10. 21 Apr, 2021 1 commit
  11. 20 Mar, 2021 1 commit
  12. 16 Mar, 2021 2 commits
  13. 13 Mar, 2021 3 commits
  14. 04 Mar, 2021 3 commits
  15. 02 Mar, 2021 2 commits
    • Clarify xmlNewDocProp documentation · ad101bb5
      Nick Wellnhofer authored
    • Stop checking attributes for UTF-8 validity · a6e6498f
      Nick Wellnhofer authored
      I can't see a reason to check attribute content for UTF-8 validity.
      Other parts of the API like xmlNewText have always assumed valid UTF-8
      as extra checks only slow down processing.
      Besides, setting doc->encoding to "ISO-8859-1" seems pointless, and not
      freeing the old encoding would cause a memory leak.
      Note that this was last changed in 2008 with commit 6f8611fd which
      removed unnecessary encoding/decoding steps. Setting attributes should
      be even faster now.
      Found by OSS-Fuzz.
  16. 01 Mar, 2021 2 commits
    • Reduce some fuzzer timeouts · 8446d459
      Nick Wellnhofer authored
      OSS-Fuzz has been fuzzing the HTML parser with inputs up to 1 MB for
      several hundred hours without hitting the 20s timeout. It seems that
      most timeouts resulting from accidentally quadratic behavior in the
      HTML parser have been fixed. Start to gradually reduce the timeout to
      find new performance issues.
    • Fix quadratic behavior when looking up xml:* attributes · 688b41a0
      Nick Wellnhofer authored
      Add a special case for the predefined XML namespace when looking up DTD
      attribute defaults in xmlGetPropNodeInternal to avoid calling
      This fixes quadratic behavior in
      - xmlNodeGetBase
      - xmlNodeGetLang
      - xmlNodeGetSpacePreserve
      Found by OSS-Fuzz.
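One way a special case for the predefined XML namespace can avoid repeated work is sketched below. The "xml" prefix is bound by definition to http://www.w3.org/XML/1998/namespace, so it can be answered without looking at the tree. The element type, the lookup function, and the step counter are hypothetical; this is not the code of xmlGetPropNodeInternal.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define XML_NS_URI "http://www.w3.org/XML/1998/namespace"

/* Hypothetical element with at most one namespace declaration. */
typedef struct Elem {
    struct Elem *parent;
    const char *ns_prefix;  /* prefix declared here, or NULL */
    const char *ns_uri;     /* URI bound to ns_prefix */
} Elem;

/*
 * Resolve `prefix` for element `e`. `steps`, if non-NULL, counts the
 * elements visited so the cost of a lookup can be observed.
 */
const char *resolve_prefix(const Elem *e, const char *prefix, int *steps)
{
    assert(prefix != NULL);
    /* Special case: "xml" is predefined, so no ancestor walk is needed. */
    if (strcmp(prefix, "xml") == 0)
        return XML_NS_URI;
    for (; e != NULL; e = e->parent) {
        if (steps != NULL)
            (*steps)++;
        if (e->ns_prefix != NULL && strcmp(e->ns_prefix, prefix) == 0)
            return e->ns_uri;
    }
    return NULL;
}
```

In this sketch, a general lookup costs time proportional to the depth of the element, so repeating it for every node of a deeply nested document is quadratic overall, while the special case is constant time.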
  17. 22 Feb, 2021 8 commits
  18. 21 Feb, 2021 1 commit
  19. 20 Feb, 2021 3 commits
    • Fix slow parsing of HTML with encoding errors · dcb80b92
      Nick Wellnhofer authored
      Under certain circumstances, the HTML parser would try to guess and
      switch input encodings multiple times, leading to slow processing of
      documents with encoding errors. The repeated scanning of the input
      buffer when guessing encodings could even lead to quadratic behavior.
      The code in htmlCurrentChar probably assumed that if there's an encoding
      handler, it is guaranteed to produce valid UTF-8. This holds true in
      general, but if the detected encoding was "UTF-8", the UTF8ToUTF8
      encoding handler simply invoked memcpy without checking for invalid
      UTF-8. This still must be fixed, preferably by not using this handler
      at all.
      Also leave a note that switching encodings twice seems impossible to
      implement correctly. Add a check when handling UTF-8 encoding errors
      in htmlCurrentChar to avoid this situation, even if encoders produce
      invalid UTF-8.
      Found by OSS-Fuzz.
    • Add a flag to not output anything when xmllint succeeded · 02bee4c4
      hhb authored and Nick Wellnhofer committed
    • Fix warnings in libxml.m4 with autoconf 2.70+. · 4defa2c2
      Simon Josefsson authored and Nick Wellnhofer committed
      Closes #219.
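Commit dcb80b92 above notes that a UTF-8 to UTF-8 handler which simply calls memcpy does not guarantee valid output. A validating copy could look like the sketch below. Both function names are hypothetical, and this shows only one possible shape of a check, not what libxml2 does.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 if s[0..len) is well-formed UTF-8, 0 otherwise. */
int utf8_is_valid(const unsigned char *s, size_t len)
{
    size_t i = 0;
    while (i < len) {
        unsigned char c = s[i];
        size_t n;              /* number of continuation bytes expected */
        unsigned long cp, min; /* code point and smallest legal value */
        if (c < 0x80) {
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            n = 1; cp = c & 0x1F; min = 0x80;
        } else if ((c & 0xF0) == 0xE0) {
            n = 2; cp = c & 0x0F; min = 0x800;
        } else if ((c & 0xF8) == 0xF0) {
            n = 3; cp = c & 0x07; min = 0x10000;
        } else {
            return 0;          /* stray continuation or invalid lead byte */
        }
        if (len - i <= n)
            return 0;          /* sequence truncated at end of buffer */
        for (size_t k = 1; k <= n; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (unsigned long)(s[i + k] & 0x3F);
        }
        /* Reject overlong forms, surrogates, and values past U+10FFFF. */
        if (cp < min || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return 0;
        i += n + 1;
    }
    return 1;
}

/* Copy src to dst only if it is valid UTF-8; returns 0 or -1. */
int utf8_checked_copy(unsigned char *dst, const unsigned char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    if (!utf8_is_valid(src, len))
        return -1;
    memcpy(dst, src, len);
    return 0;
}
```

A caller that receives -1 can report a single encoding error instead of passing invalid bytes on to later stages.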
  20. 09 Feb, 2021 1 commit