  1. 06 Jul, 2021 1 commit
    • Patch to forbid epsilon-reduction of final states · ec6e3efb
      Arne Becker authored and Nick Wellnhofer committed
      When building the internal representation of a regexp, it is possible
      that a lot of empty transitions are created. Therefore there is a step
      to reduce them in the function xmlFAEliminateSimpleEpsilonTransitions.
      There is an error there for this case:
      * State 1 has a transition with an atom (in this case "a") to state 2.
      * State 2 is final and has an epsilon transition to state 1.
      After reduction it looked like:
      * State 1 has a transition with an atom (in this case "a") to itself
        and is final.
      In other words, the empty string is accepted when it shouldn't be.
      The attached patch skips the reduction step for final states.
      An alternative would be to insert or increment counters when reducing a
      final state, but this seemed error prone and unnecessary, since there
      aren't that many final states.
      Fixes #282
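The failure case described in this commit message can be reproduced on a toy automaton. The sketch below is a simplified Python model, not libxml2's C code: the `Automaton` class, `accepts`, and `eliminate_simple_epsilon` are hypothetical names, and the merge rule (redirect incoming transitions to the epsilon target and carry finality over) is an assumption about what a naive reduction does.

```python
from dataclasses import dataclass, field

EPSILON = None  # marker for an empty (epsilon) transition


@dataclass
class Automaton:
    start: int
    finals: set
    # Each transition is (source, atom, target); atom is EPSILON for empty moves.
    transitions: list = field(default_factory=list)


def closure(nfa, states):
    """Return every state reachable from `states` through epsilon moves."""
    seen, stack = set(states), list(states)
    while stack:
        state = stack.pop()
        for src, atom, dst in nfa.transitions:
            if src == state and atom is EPSILON and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(nfa, text):
    """Simulate the automaton on `text`."""
    current = closure(nfa, {nfa.start})
    for ch in text:
        moved = {dst for src, atom, dst in nfa.transitions
                 if src in current and atom == ch}
        current = closure(nfa, moved)
    return bool(current & nfa.finals)


def eliminate_simple_epsilon(nfa, skip_final):
    """Merge each state whose only outgoing transition is an epsilon
    into the target of that transition (hypothetical simplification)."""
    finals = set(nfa.finals)
    transitions = list(nfa.transitions)
    states = {nfa.start} | finals | {s for t in transitions for s in (t[0], t[2])}
    for state in sorted(states):
        out = [t for t in transitions if t[0] == state]
        if len(out) != 1 or out[0][1] is not EPSILON:
            continue
        target = out[0][2]
        if state == nfa.start or target == state:
            continue
        if skip_final and state in finals:
            continue  # the approach taken by the patch: leave final states alone
        transitions.remove(out[0])
        # Redirect every transition that pointed at `state` to `target`.
        transitions = [(src, atom, target if dst == state else dst)
                       for src, atom, dst in transitions]
        if state in finals:
            finals.discard(state)
            finals.add(target)  # naive merge: the target inherits finality
    return Automaton(nfa.start, finals, transitions)


# State 1 --a--> state 2; state 2 is final and has an epsilon back to state 1.
nfa = Automaton(start=1, finals={2}, transitions=[(1, "a", 2), (2, EPSILON, 1)])
print(accepts(nfa, ""))                                              # False
print(accepts(eliminate_simple_epsilon(nfa, skip_final=False), ""))  # True (the bug)
print(accepts(eliminate_simple_epsilon(nfa, skip_final=True), ""))   # False
```

With `skip_final=False` the result is a single final state with an "a" loop, which matches the faulty automaton described above. Skipping final states leaves the two-state automaton untouched.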
  2. 07 Jun, 2021 2 commits
  3. 02 Jun, 2021 1 commit
    • Fix XPath recursion limit · 3e1aad4f
      Nick Wellnhofer authored
      Fix accounting of recursion depth when parsing XPath expressions.
      This silly bug introduced in commit 804c5297 could lead to spurious
      errors when parsing larger expressions or XSLT documents.
      Should fix #264.
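The commit message does not say what the accounting mistake was. As an illustration only, the Python sketch below shows one way such a bug can produce spurious errors: a depth counter that is raised by more on entry than it is lowered on exit. `MAX_DEPTH`, `DepthError`, `count_groups`, and the cost values are all hypothetical and not taken from libxml2.

```python
MAX_DEPTH = 50  # hypothetical limit, not libxml2's value


class DepthError(Exception):
    """Raised when the (accounted) recursion depth exceeds MAX_DEPTH."""


def count_groups(expr, enter_cost=10, leave_cost=10):
    """Count parenthesised groups in `expr` with a recursion limit.

    Accounting is only correct when enter_cost == leave_cost.
    """
    depth = 0
    pos = 0
    groups = 0

    def parse_group():
        nonlocal depth, pos, groups
        if depth >= MAX_DEPTH:
            raise DepthError("recursion limit exceeded")
        depth += enter_cost
        pos += 1  # consume '('
        while pos < len(expr) and expr[pos] != ")":
            if expr[pos] == "(":
                parse_group()
            else:
                pos += 1
        if pos >= len(expr):
            raise ValueError("unbalanced parentheses")
        pos += 1  # consume ')'
        groups += 1
        depth -= leave_cost  # must undo exactly what was added on entry

    while pos < len(expr):
        if expr[pos] == "(":
            parse_group()
        else:
            pos += 1
    return groups


flat = "(a)" * 20                 # long, but never nested deeper than 1
print(count_groups(flat))         # 20
try:
    count_groups(flat, leave_cost=1)
except DepthError as err:
    print("spurious error:", err)
```

With mismatched costs the counter drifts upward on every group, so a long but shallow expression eventually hits the limit even though the real nesting depth never exceeds one.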
  4. 25 May, 2021 1 commit
  5. 23 May, 2021 4 commits
  6. 21 May, 2021 3 commits
  7. 20 May, 2021 1 commit
  8. 13 May, 2021 3 commits
    • Release of libxml2-2.9.12 · b48e77cf
      Daniel Veillard authored
      Brown paper bag release, some recently added sources were missing from
      the 2.9.11 tarball:
      - configure.ac: bump version
      - fuzz/Makefile.am: add fuzz.h and seed/regexp to EXTRA_DIST
    • Release of libxml2-2.9.11 · e1bcffea
      Daniel Veillard authored
      Prompted by CVE-2021-3541, but this includes an awful lot of serious bug
      fixes by Nick and others.
      - configure.ac: bumped to new release
      - doc/* updated and regenerated
    • Patch for security issue CVE-2021-3541 · 8598060b
      Daniel Veillard authored
      This is related to parameter entity expansion, along the lines of
      the billion laughs attack. Somehow the counting of parameter entities
      was missed in that path, so the normal algorithm based on entity
      "density" was useless.
  9. 09 May, 2021 1 commit
    • Fix null deref in legacy SAX1 parser · bfd2f430
      Nick Wellnhofer authored
      Always call nameNsPush instead of namePush. The latter is unused now
      and should probably be removed from the public API. I can't see how
      it could be used reasonably from client code and the unprefixed name
      has always polluted the global namespace.
      Fixes a null pointer dereference introduced with de5b624f when parsing
      in SAX1 mode.
      Found by OSS-Fuzz.
  10. 08 May, 2021 2 commits
    • Store per-element parser state in a struct · ce00c36e
      Nick Wellnhofer authored
      Make the parser context's "pushTab" point to an array of structs
      instead of void pointers. This avoids casting unrelated types to void
      pointers, improving readability and portability, and allows for more
      efficient packing. Ultimately, the struct could be extended to include
      the contents of "nameTab" and "spaceTab", further simplifying the code.
      Historically, "pushTab" was only used by the push parser (hence the
      name), so the change to the public headers should be safe.
      Also remove an unused parameter from xmlParseEndTag2.
    • Fix handling of unexpected EOF in xmlParseContent · de5b624f
      Nick Wellnhofer authored
      Re-add the XML_ERR_TAG_NOT_FINISHED error on unexpected EOF which was
      removed in commit 62150ed2.
      That commit also introduced a regression for direct users of
      xmlParseContent: unclosed tags weren't checked.
  11. 07 May, 2021 1 commit
    • Fix line numbers in error messages for mismatched tags · 3e80560d
      Nick Wellnhofer authored
      Commit 62150ed2 introduced a small regression in the error messages for
      mismatched tags. This typically only affected messages after the first
      mismatch, but with custom SAX handlers all line numbers would be off.
      This also fixes line numbers in the SAX push parser which were never
      handled correctly.
  12. 06 May, 2021 1 commit
    • Fix htmlTagLookup · 7279d236
      Nick Wellnhofer authored
      Fix regression introduced with b25acce8. Some users like libxslt may
      call the HTML output functions on documents with uppercase tag names,
      so we must keep case-insensitive string comparison.
      Fixes #248.
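Why case sensitivity matters for a tag lookup can be shown with a small Python model. It assumes a sorted table of lowercase names searched by binary search, which the commit message does not state; `HTML_TAGS` is a tiny illustrative subset and `tag_lookup` is a hypothetical stand-in for the C function.

```python
from bisect import bisect_left

# Small illustrative subset, stored in lowercase and kept sorted.
HTML_TAGS = sorted(["a", "body", "div", "html", "p", "table"])


def tag_lookup(name, ignore_case=True):
    """Binary-search the tag table; return the canonical name or None."""
    key = name.lower() if ignore_case else name
    i = bisect_left(HTML_TAGS, key)
    if i < len(HTML_TAGS) and HTML_TAGS[i] == key:
        return HTML_TAGS[i]
    return None


print(tag_lookup("DIV"))                     # div
print(tag_lookup("DIV", ignore_case=False))  # None
```

A case-sensitive comparison misses "DIV" entirely, so a caller that passes uppercase tag names would be told the element is unknown.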
  13. 03 May, 2021 1 commit
  14. 01 May, 2021 1 commit
    • Propagate error in xmlParseElementChildrenContentDeclPriv · babe7503
      Nick Wellnhofer authored
      Check return value of recursive calls to
      xmlParseElementChildrenContentDeclPriv and return immediately in case
      of errors. Otherwise, struct xmlElementContent could contain unexpected
      null pointers, leading to a null deref when post-validating documents
      which aren't well-formed and parsed in recovery mode.
      Fixes #243.
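The pattern behind this fix, checking the result of a recursive call before storing it, can be sketched with a toy recursive parser in Python. The grammar, `parse_group`, and `count_names` are hypothetical and much simpler than the real content-declaration parser; `None` plays the role of a null pointer.

```python
def parse_group(text, pos, propagate=True):
    """Parse '(' item (',' item)* ')' starting at text[pos].

    An item is a single letter or a nested group.
    Returns (children, next_pos), or (None, pos) on a syntax error.
    """
    if pos >= len(text) or text[pos] != "(":
        return None, pos
    pos += 1
    children = []
    while True:
        if pos < len(text) and text[pos] == "(":
            child, pos = parse_group(text, pos, propagate)
            if child is None and propagate:
                return None, pos  # stop immediately on a nested error
        elif pos < len(text) and text[pos].isalpha():
            child, pos = text[pos], pos + 1
        else:
            return None, pos
        children.append(child)  # without the check, None can end up here
        if pos < len(text) and text[pos] == ",":
            pos += 1
        elif pos < len(text) and text[pos] == ")":
            return children, pos + 1
        else:
            return None, pos


def count_names(node):
    """Walk the tree the way a later validation pass might."""
    if isinstance(node, str):
        return 1
    return sum(count_names(child) for child in node)  # fails on a None child


good, _ = parse_group("(a,(b,c))", 0)
print(count_names(good))                           # 3

broken, _ = parse_group("(a,()", 0, propagate=False)
print(broken)                                      # ['a', None]
print(parse_group("(a,()", 0, propagate=True)[0])  # None
```

Without propagation the malformed input yields a tree containing `None`, and `count_names(broken)` raises a `TypeError`, the Python analogue of the null dereference described above. With propagation the caller sees a clean failure instead.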
  15. 25 Apr, 2021 1 commit
  16. 22 Apr, 2021 3 commits
  17. 21 Apr, 2021 1 commit
  18. 20 Mar, 2021 1 commit
  19. 16 Mar, 2021 2 commits
  20. 13 Mar, 2021 3 commits
  21. 04 Mar, 2021 3 commits
  22. 02 Mar, 2021 2 commits
    • Clarify xmlNewDocProp documentation · ad101bb5
      Nick Wellnhofer authored
    • Stop checking attributes for UTF-8 validity · a6e6498f
      Nick Wellnhofer authored
      I can't see a reason to check attribute content for UTF-8 validity.
      Other parts of the API like xmlNewText have always assumed valid UTF-8
      as extra checks only slow down processing.
      Besides, setting doc->encoding to "ISO-8859-1" seems pointless, and not
      freeing the old encoding would cause a memory leak.
      Note that this was last changed in 2008 with commit 6f8611fd which
      removed unnecessary encoding/decoding steps. Setting attributes should
      be even faster now.
      Found by OSS-Fuzz.
  23. 01 Mar, 2021 1 commit
    • Reduce some fuzzer timeouts · 8446d459
      Nick Wellnhofer authored
      OSS-Fuzz has been fuzzing the HTML parser with inputs up to 1 MB for
      several hundred hours without hitting the 20s timeout. It seems that
      most timeouts resulting from accidentally quadratic behavior in the
      HTML parser have been fixed. Start to gradually reduce the timeout to
      find new performance issues.