1. 23 Jan, 2018 1 commit
    • Fix xmlParserEntityCheck · 707ad080
      Nick Wellnhofer authored
      A previous commit removed the check for XML_ERR_ENTITY_LOOP which is
      required to abort early in case of excessive entity recursion.
  2. 22 Jan, 2018 2 commits
  3. 08 Jan, 2018 1 commit
  4. 08 Dec, 2017 1 commit
  5. 27 Nov, 2017 1 commit
    • Fix libz and liblzma detection · cb5541c9
      Nick Wellnhofer authored
      If libz or liblzma are detected with pkg-config, AC_CHECK_HEADERS must
      not be run because the correct CPPFLAGS aren't set. It is actually not
      required to have separate checks for LIBXML_ZLIB_ENABLED and HAVE_ZLIB_H.
      Only check for LIBXML_ZLIB_ENABLED and remove HAVE_ZLIB_H macro.
      
      Fixes bug 764657, bug 787041.
  6. 09 Nov, 2017 2 commits
    • Fix hash callback signatures · e03f0a19
      Nick Wellnhofer authored
      Make sure that all parameters and return values of hash callback
      functions exactly match the callback function type. This is required
      to pass clang's Control Flow Integrity checks and to allow compilation
      to asm.js with Emscripten.
      
      Fixes bug 784861.
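
A minimal sketch of the pattern this commit message describes, assuming a toy table. The `scan_fn` type, the `entry` struct and `scan_all` are hypothetical and only loosely modelled on a hash scanner callback. The callback is declared with exactly the parameter and return types of the function pointer type, and any casts to the concrete types happen inside the callback body rather than at the call site.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback type: payload, user data, entry name. */
typedef void (*scan_fn)(void *payload, void *data, const char *name);

/* Hypothetical table entry. */
typedef struct {
    const char *name;
    int value;
} entry;

/* Hypothetical walker that calls fn through the scan_fn type. */
static void scan_all(entry *entries, size_t n, scan_fn fn, void *data) {
    assert(fn != NULL);
    for (size_t i = 0; i < n; i++)
        fn(&entries[i], data, entries[i].name);
}

/* The signature matches scan_fn exactly, so no function pointer cast is
 * needed when passing it to scan_all. */
static void sum_values(void *payload, void *data, const char *name) {
    (void)name;                               /* unused in this sketch */
    *(int *)data += ((entry *)payload)->value;
}

int sum_entries(entry *entries, size_t n) {
    int total = 0;
    scan_all(entries, n, sum_values, &total);
    return total;
}
```

Declaring `sum_values` with typed parameters such as `entry *` and casting the function pointer instead would call it through an incompatible type, which is undefined behavior in C and is the kind of call that indirect-call checkers reject.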
    • Refactor name and type signature for xmlNop · 28f52fe8
      Vlad Tsyrklevich authored
      Update xmlNop's name to xmlInputReadCallbackNop and its type signature
      to match xmlInputReadCallback.
      
      Fixes bug 786134.
  7. 09 Oct, 2017 2 commits
    • Fix the Windows header mess · e3890546
      Nick Wellnhofer authored
      Don't include windows.h and wsockcompat.h from config.h but only when
      needed.
      
      Don't define _WINSOCKAPI_ manually. This was apparently done to stop
      windows.h from including winsock.h which is a problem if winsock2.h
      wasn't included first. But on MinGW, this causes compiler warnings.
      Define WIN32_LEAN_AND_MEAN instead which has the same effect.
      
      Always use the compiler-defined _WIN32 macro instead of WIN32.
    • Fix pointer/int cast warnings on 64-bit Windows · d422b954
      Nick Wellnhofer authored
      On 64-bit Windows, `long` is 32 bits wide and can't hold a pointer.
      Switch to ptrdiff_t instead which should be the same size as a pointer
      on every somewhat sane platform without requiring C99 types like
      intptr_t.
      
      Fixes bug 788312.
      
      Thanks to J. Peter Mugaas for the report and initial patch.
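
A minimal sketch of the cast pattern, assuming the goal is to carry a small integer in a `void *` slot. The helper names `int_to_ptr` and `ptr_to_int` are hypothetical. Going through `ptrdiff_t` avoids the size mismatch that a cast through `long` produces on platforms where `long` is narrower than a pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Store a small integer in a pointer-sized slot. The intermediate type is
 * ptrdiff_t, which has pointer width on common platforms, including
 * 64-bit Windows where long is only 32 bits wide. */
void *int_to_ptr(int value) {
    return (void *)(ptrdiff_t)value;
}

/* Read the integer back out of the slot. */
int ptr_to_int(void *p) {
    return (int)(ptrdiff_t)p;
}

/* Reports whether ptrdiff_t is at least as wide as a pointer on this
 * platform, which is the assumption the casts above rely on. */
int ptrdiff_holds_pointer(void) {
    return sizeof(ptrdiff_t) >= sizeof(void *);
}
```

The result of converting an integer to a pointer is implementation-defined, so this is a sketch of the idiom rather than a portable guarantee.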
  8. 19 Sep, 2017 1 commit
  9. 13 Sep, 2017 1 commit
    • Handle more invalid entity values in recovery mode · abbda93c
      Nick Wellnhofer authored
      In attribute content, don't emit entity references if there are
      problems with the entity value. Otherwise some illegal entity values
      like
      
          <!ENTITY a '&#38;#x123456789;'>
      
      would later cause problems like integer overflow.
      
      Make xmlStringLenDecodeEntities return NULL on more error conditions
      including invalid char refs and errors from recursive calls. Remove
      some fragile error checks based on lastError that shouldn't be
      needed now. Clear the entity content in xmlParseAttValueComplex if
      an error was found.
      
      Found by OSS-Fuzz. Should fix bug 783052.
      
      Also see https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3343
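
A minimal sketch of how a character reference such as the `&#x123456789;` in the example above can be decoded without integer overflow. The function `parse_hex_char_ref` and the constant `MAX_CODEPOINT` are hypothetical and this is not the code from the commit. The accumulator is clamped as soon as it leaves the Unicode range, so further digits cannot overflow it, and the caller gets an error value instead of a wrapped number.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CODEPOINT 0x10FFFFL

/* Decodes a hexadecimal character reference of the form "&#x...;".
 * Returns the code point, or -1 if the reference is malformed or the
 * value is larger than MAX_CODEPOINT. */
long parse_hex_char_ref(const char *s) {
    long val = 0;

    assert(s != NULL);
    if (s[0] != '&' || s[1] != '#' || s[2] != 'x')
        return -1;
    s += 3;
    if (*s == ';')
        return -1;                       /* no digits at all */
    for (; *s != ';'; s++) {
        int digit;
        if (*s >= '0' && *s <= '9')
            digit = *s - '0';
        else if (*s >= 'a' && *s <= 'f')
            digit = *s - 'a' + 10;
        else if (*s >= 'A' && *s <= 'F')
            digit = *s - 'A' + 10;
        else
            return -1;                   /* bad digit or missing ';' */
        val = val * 16 + digit;
        if (val > MAX_CODEPOINT)
            val = MAX_CODEPOINT + 1;     /* clamp so val cannot overflow */
    }
    return (val > MAX_CODEPOINT) ? -1 : val;
}
```

The sketch only checks the upper bound. A real XML parser also has to reject values that are in range but not legal XML characters.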
  10. 07 Sep, 2017 1 commit
  11. 30 Aug, 2017 1 commit
  12. 28 Aug, 2017 1 commit
    • Porting libxml2 on zOS encoding of code · 454e397e
      Stéphane Michaut authored
      First set of patches for zOS
      - entities.c parser.c tree.c xmlschemas.c xmlschemastypes.c xpath.c xpointer.c:
        request conversion of code to ISO Latin 1 to avoid having the compiler assume
        EBCDIC codepoints for characters.
      - xmlmodule.c: make sure we have support for modules
      - xmlIO.c: zOS path names are special; avoid some of the expectations from
        Unix/Windows
  13. 25 Jul, 2017 1 commit
  14. 04 Jul, 2017 1 commit
  15. 20 Jun, 2017 6 commits
    • Fix NULL deref in xmlParseExternalEntityPrivate · 3eef3f39
      Nick Wellnhofer authored
      If called from xmlParseExternalEntity, oldctxt is NULL which leads to
      a NULL deref if an error occurs. This only affects external code that
      calls xmlParseExternalEntity.
      
      Patch from David Kilzer with minor changes.
      
      Fixes bug 780159.
    • Get rid of "blanks wrapper" for parameter entities · 872fea94
      Nick Wellnhofer authored
      Now that replacement of parameter entities goes exclusively through
      xmlSkipBlankChars, we can account for the surrounding space characters
      there and remove the "blanks wrapper" hack.
    • Make sure not to call IS_BLANK_CH when parsing the DTD · d9e43c7d
      Nick Wellnhofer authored
      This is required to get rid of the "blanks wrapper" hack. Checking the
      return value of xmlSkipBlankChars is more efficient, too.
    • Remove unnecessary calls to xmlPopInput · 453dff1e
      Nick Wellnhofer authored
      It's enough if xmlPopInput is called from xmlSkipBlankChars. Since the
      replacement text of a parameter entity is surrounded with space
      characters, that's the only place where the replacement can end in a
      well-formed document.
      
      This is also required to get rid of the "blanks wrapper" hack.
    • Simplify handling of parameter entity references · aa267cd1
      Nick Wellnhofer authored
      There are only two places where parameter entity references must be
      handled: in xmlParseInternalSubset for the internal subset, and in
      xmlSkipBlankChars for the external subset or content from other
      external PEs.
      
      Make sure that xmlSkipBlankChars skips over sequences of PEs and
      whitespace. Rely on xmlSkipBlankChars instead of calling
      xmlParsePEReference directly when in the external subset or a
      conditional section.
      
      xmlParserHandlePEReference is unused now.
    • Fix xmlHaltParser · 24246c76
      Nick Wellnhofer authored
      Pop all extra input streams before resetting the input. Otherwise,
      a call to xmlPopInput could make input available again.
      
      Also set input->end to input->cur.
      
      Changes the test output for some error tests. Unfortunately, some
      fuzzed test cases were added to the test suite without manual cleanup.
      This makes it almost impossible to review the impact of later changes
      on the test output.
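
A minimal sketch of the halting order described in this commit message, assuming a toy input stack. The `input` and `parser` structs and the functions `push_input`, `pop_input`, `has_data` and `halt` are hypothetical and are not libxml2's API. All nested inputs are popped first, and only then is the remaining input emptied, so a later pop cannot expose an input that still has data.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INPUTS 8

/* Hypothetical input stream: [cur, end) is the unread data. */
typedef struct {
    const char *cur;
    const char *end;
} input;

/* Hypothetical parser with a small stack of nested inputs. */
typedef struct {
    input stack[MAX_INPUTS];
    int depth;
} parser;

int push_input(parser *p, const char *text, size_t len) {
    if (p->depth >= MAX_INPUTS)
        return -1;
    p->stack[p->depth].cur = text;
    p->stack[p->depth].end = text + len;
    p->depth++;
    return 0;
}

/* Pops one nested input but never the base input. */
void pop_input(parser *p) {
    if (p->depth > 1)
        p->depth--;
}

/* Nonzero if the current (topmost) input still has unread data. */
int has_data(const parser *p) {
    const input *in;
    if (p->depth == 0)
        return 0;
    in = &p->stack[p->depth - 1];
    return in->cur < in->end;
}

void halt(parser *p) {
    assert(p != NULL && p->depth >= 1);
    while (p->depth > 1)            /* drop every nested input first */
        pop_input(p);
    p->stack[0].cur = "";           /* then empty the base input */
    p->stack[0].end = p->stack[0].cur;
}
```

If `halt` only emptied the topmost input, a subsequent `pop_input` would make the data of the input underneath it available again, which is the situation the commit message describes.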
  16. 17 Jun, 2017 4 commits
    • Spelling and grammar fixes · 8bbe4508
      Nick Wellnhofer authored
      Fixes bug 743172, bug 743489, bug 769632, bug 782400 and a few other
      misspellings.
    • Rework entity boundary checks · 5f440d8c
      Nick Wellnhofer authored
      Make sure to finish all entities in the internal subset. Nevertheless,
      readd a sanity check in xmlParseStartTag2 that was lost in my previous
      commit. Also add a sanity check in xmlPopInput. Popping an input
      unexpectedly was the source of many recent memory bugs. The check
      doesn't mitigate such issues but helps with diagnosis.
      
      Always base entity boundary checks on the input ID, not the input
      pointer. The pointer could have been reallocated to the old address.
      
      Always throw a well-formedness error if a boundary check fails. In a
      few places, a validity error was thrown.
      
      Fix a few error codes and improve indentation.
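
A minimal sketch of why a boundary check based on an ID is more robust than one based on a pointer, assuming a toy input type. The `input` struct, `input_open` and `same_entity` are hypothetical names. When storage is reused, a new input can live at the same address as the old one, so comparing pointers reports "same input" while comparing IDs does not.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical input stream with a unique, increasing ID. */
typedef struct {
    int id;
    const char *cur;
} input;

static int next_input_id = 1;

/* Starts a new input in the given storage and gives it a fresh ID. */
void input_open(input *in, const char *text) {
    assert(in != NULL);
    in->id = next_input_id++;
    in->cur = text;
}

/* Boundary check: a construct must end in the input it started in.
 * start_id is the ID recorded when the construct began. */
int same_entity(int start_id, const input *current) {
    return current != NULL && start_id == current->id;
}
```

A caller records `in->id` at the start of a declaration and calls `same_entity` at its end. A pointer comparison in the same place would pass whenever the allocator happened to hand out the old address again.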
    • Don't switch encoding for internal parameter entities · 46dc9890
      Nick Wellnhofer authored
      This is only needed for external entities. Trying to switch the encoding
      for internal entities could also cause a memory leak in recovery mode.
    • Merge duplicate code paths handling PE references · 03904159
      Nick Wellnhofer authored
      xmlParsePEReference is essentially a subset of
      xmlParserHandlePEReference, so make xmlParserHandlePEReference call
      xmlParsePEReference. The code paths in these functions differed
      slightly, but the code from xmlParserHandlePEReference seems more solid
      and tested.
  17. 16 Jun, 2017 1 commit
    • Fix duplicate SAX callbacks for entity content · 3f0627a1
      David Kilzer authored
      Reset 'was_checked' to prevent entity from being parsed twice and SAX
      callbacks being invoked twice if XML_PARSE_NOENT was set.
      
      This regressed in version 2.9.3 and caused problems with WebKit.
      
      Fixes bug 760367.
  18. 10 Jun, 2017 3 commits
    • Fix potential infinite loop in xmlStringLenDecodeEntities · fb2f518c
      Nick Wellnhofer authored
      Make sure that xmlParseStringPEReference advances the "str" pointer
      even if the parser was stopped. Otherwise xmlStringLenDecodeEntities
      can loop infinitely.
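
A minimal sketch of the progress guarantee described above, assuming a toy decoder. The functions `parse_ref` and `count_steps` and the `stopped` flag are hypothetical and are not the libxml2 functions named in the commit message. The helper advances the caller's pointer on every path, including the early return taken when the parser is stopped, so the calling loop always terminates.

```c
#include <assert.h>
#include <stddef.h>

/* Consumes a reference of the form "%name;" starting at *str.
 * Returns 0 on success and -1 if the parser was stopped. In both cases
 * *str is advanced past at least the '%' character. */
static int parse_ref(const char **str, int stopped) {
    const char *p = *str;

    assert(*p == '%');
    p++;                        /* always consume the '%' */
    if (stopped) {
        *str = p;               /* make progress even on the error path */
        return -1;
    }
    while (*p != '\0' && *p != ';')
        p++;
    if (*p == ';')
        p++;
    *str = p;
    return 0;
}

/* Walks the string and returns the number of loop iterations needed.
 * The loop relies on parse_ref always moving s forward. */
size_t count_steps(const char *s, int stopped) {
    size_t steps = 0;

    assert(s != NULL);
    while (*s != '\0') {
        if (*s == '%')
            (void)parse_ref(&s, stopped);
        else
            s++;
        steps++;
    }
    return steps;
}
```

If the stopped path returned without updating `*str`, the loop in `count_steps` would see the same `%` forever.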
    • Remove useless check in xmlParseAttributeListDecl · 4ba8cc85
      Nick Wellnhofer authored
      Since we already successfully parsed the attribute name and other
      items, it is guaranteed that we made progress in the input stream.
      
      Comparing the input pointer to a previous value also looks fragile to
      me. What if the input buffer was reallocated and the new "cur" pointer
      happens to be the same as the old one? There are a couple of similar
      checks which also take "consumed" into account. This seems to be safer
      but I'm not convinced that it couldn't lead to false alarms in rare
      situations.
    • Fix memory leak in xmlParseEntityDecl error path · bedbef80
      Nick Wellnhofer authored
      When parsing the entity value, it can happen that an external entity
      with an unsupported encoding is loaded and the parser is stopped. This
      would lead to a memory leak.
      
      A custom SAX callback could also stop the parser.
      
      Found with libFuzzer and ASan.
  19. 06 Jun, 2017 1 commit
  20. 05 Jun, 2017 1 commit
    • Fix handling of parameter-entity references · e2663054
      Nick Wellnhofer authored
      There were two bugs where parameter-entity references could lead to an
      unexpected change of the input buffer in xmlParseNameComplex and
      xmlDictLookup being called with an invalid pointer.
      
      Percent sign in DTD Names
      =========================
      
      The NEXTL macro used to call xmlParserHandlePEReference. When parsing
      "complex" names inside the DTD, this could result in entity expansion
      which created a new input buffer. The fix is to simply remove the call
      to xmlParserHandlePEReference from the NEXTL macro. This is safe because
      no users of the macro require expansion of parameter entities.
      
      - xmlParseNameComplex
      - xmlParseNCNameComplex
      - xmlParseNmtoken
      
      The percent sign is not allowed in names, which are grammatical tokens.
      
      - xmlParseEntityValue
      
      Parameter-entity references in entity values are expanded but this
      happens in a separate step in this function.
      
      - xmlParseSystemLiteral
      
      Parameter-entity references are ignored in the system literal.
      
      - xmlParseAttValueComplex
      - xmlParseCharDataComplex
      - xmlParseCommentComplex
      - xmlParsePI
      - xmlParseCDSect
      
      Parameter-entity references are ignored outside the DTD.
      
      - xmlLoadEntityContent
      
      This function is only called from xmlStringLenDecodeEntities and
      entities are replaced in a separate step immediately after the function
      call.
      
      This bug could also be triggered with an internal subset and double
      entity expansion.
      
      This fixes bug 766956 initially reported by Wei Lei and independently by
      Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone
      involved.
      
      xmlParseNameComplex with XML_PARSE_OLD10
      ========================================
      
      When parsing Names inside an expanded parameter entity with the
      XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the
      GROW macro if the input buffer was exhausted. At the end of the
      parameter entity's replacement text, this function would then call
      xmlPopInput which invalidated the input buffer.
      
      There should be no need to invoke GROW in this situation because the
      buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and,
      at least for UTF-8, in xmlCurrentChar. This also matches the code path
      executed when XML_PARSE_OLD10 is not set.
      
      This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050).
      Thanks to Marcel Böhme and Thuan Pham for the report.
      
      Additional hardening
      ====================
      
      A separate check was added in xmlParseNameComplex to validate the
      buffer size.
  21. 01 Jun, 2017 3 commits
    • Avoid reparsing in xmlParseStartTag2 · 855c19ef
      Nick Wellnhofer authored
      The code in xmlParseStartTag2 must handle the case that the input
      buffer was grown and reallocated which can invalidate pointers to
      attribute values. Before, this was handled by detecting changes of
      the input buffer "base" pointer and, in case of a change, jumping
      back to the beginning of the function and reparsing the start tag.
      
      The major problem of this approach is that whether an input buffer is
      reallocated is nondeterministic, resulting in seemingly random test
      failures. See the mailing list thread "runtest mystery bug: name2.xml
      error case regression test" from 2012, for example.
      
      If a reallocation was detected, the code also made no attempts to
      continue parsing in case of errors which makes a difference in
      the lax "recover" mode.
      
      Now we store the current input buffer "base" pointer for each (not
      separately allocated) attribute in the namespace URI field, which isn't
      used until later. After the whole start tag was parsed, the pointers
      to the attribute values are reconstructed using the offset between the
      new and the old input buffer. This relies on arithmetic on dangling
      pointers which is technically undefined behavior. But it seems like
      the easiest and most efficient fix and a similar approach is used in
      xmlParserInputGrow.
      
      This changes the error output of several tests, typically making it
      more verbose because we try harder to continue parsing in case of
      errors.
      
      (Another possible solution is to check not only the "base" pointer
      but the size of the input buffer as well. But this would result in
      even more reparsing.)
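
A minimal sketch of keeping references into a growable buffer valid across reallocation. This is a different, well-defined variant and not the technique from the commit: instead of storing the old "base" pointer and rebasing afterwards, it stores offsets from the start of the buffer and turns them into pointers only when needed. The `buffer` and `span` types and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable text buffer. */
typedef struct {
    char *data;
    size_t len;
    size_t cap;
} buffer;

/* A value inside the buffer, stored as an offset rather than a pointer. */
typedef struct {
    size_t off;
    size_t len;
} span;

/* Appends s, growing (and possibly moving) the buffer. Returns 0 on
 * success and -1 if memory could not be allocated. */
int buffer_append(buffer *b, const char *s) {
    size_t n;

    assert(b != NULL && s != NULL);
    n = strlen(s);
    if (b->len + n + 1 > b->cap) {
        size_t cap = (b->len + n + 1) * 2;
        char *p = realloc(b->data, cap);
        if (p == NULL)
            return -1;
        b->data = p;            /* any old pointer into data is now stale */
        b->cap = cap;
    }
    memcpy(b->data + b->len, s, n + 1);
    b->len += n;
    return 0;
}

/* Resolves a span against the current location of the buffer. */
const char *span_ptr(const buffer *b, span s) {
    assert(s.off + s.len <= b->len);
    return b->data + s.off;
}
```

Because a `span` never holds an address, it stays valid no matter how often the buffer moves. The cost compared with the approach in the commit message is that every stored value needs an offset field rather than reusing an existing pointer field.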
    • Simplify control flow in xmlParseStartTag2 · 07b7428b
      Nick Wellnhofer authored
      Remove some goto labels and deduplicate a bit of code after handling
      namespaces.
      
      Before:
      
          loop {
              parseAttribute
              if (ok) {
                  if (defaultNamespace) {
                      handleDefaultNamespace
                      if (error)
                          goto skip_default_ns;
                      handleDefaultNamespace
          skip_default_ns:
                      freeAttr
                      nextAttr
                      continue;
                  }
                  if (namespace) {
                      handleNamespace
                      if (error)
                          goto skip_ns;
                      handleNamespace
          skip_ns:
                      freeAttr
                      nextAttr;
                      continue;
                  }
                  handleAttr
              } else {
                  freeAttr
              }
              nextAttr
          }
      
      After:
      
          loop {
              parseAttribute
              if (!ok)
                  goto next_attr;
              if (defaultNamespace) {
                  handleDefaultNamespace
                  if (error)
                      goto next_attr;
                  handleDefaultNamespace
              } else if (namespace) {
                  handleNamespace
                  if (error)
                      goto next_attr;
                  handleNamespace
              } else {
                  handleAttr
              }
          next_attr:
              freeAttr
              nextAttr
          }
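
The "After" shape above, a single cleanup label at the bottom of the loop body, can be sketched as compilable C. The function `count_valid` and its arguments are hypothetical and unrelated to attribute parsing. Every early exit jumps to the same label, so the per-iteration resource is released in exactly one place.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Copies each item, counts the non-empty ones, and frees the copy at a
 * single label. *freed receives the number of copies that were released. */
size_t count_valid(const char *const *items, size_t n, size_t *freed) {
    size_t valid = 0;

    assert(items != NULL && freed != NULL);
    *freed = 0;
    for (size_t i = 0; i < n; i++) {
        char *copy = NULL;

        if (items[i] == NULL)
            goto next_item;             /* nothing to parse */
        copy = malloc(strlen(items[i]) + 1);
        if (copy == NULL)
            goto next_item;             /* allocation failed */
        strcpy(copy, items[i]);
        if (copy[0] == '\0')
            goto next_item;             /* "error": empty item */
        valid++;

    next_item:
        if (copy != NULL)
            (*freed)++;
        free(copy);                     /* free(NULL) is a no-op */
    }
    return valid;
}
```

With one label there is no path that skips the cleanup or runs it twice, which is the property the duplicated `freeAttr`/`nextAttr` pairs in the "Before" version had to maintain by hand.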
    • Avoid spurious UBSan errors in parser.c · 47496724
      Nick Wellnhofer authored
      If available, use a C99 flexible array member to avoid spurious UBSan
      errors.
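
A minimal sketch of a C99 flexible array member, assuming a toy `blob` type. The struct and `blob_new` are hypothetical and are not the structure changed in parser.c. Declaring the trailing array as `data[]` tells the compiler that its real length is decided at allocation time, whereas the older `data[1]` idiom makes every index above 0 look like an out-of-bounds access to a bounds checker.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Header followed by a variable amount of character data. */
typedef struct {
    size_t len;
    char data[];        /* C99 flexible array member, must be last */
} blob;

/* Allocates the header and the string storage in one block.
 * Returns NULL if memory could not be allocated. */
blob *blob_new(const char *s) {
    size_t n;
    blob *b;

    assert(s != NULL);
    n = strlen(s);
    b = malloc(sizeof(blob) + n + 1);   /* sizeof excludes data[] */
    if (b == NULL)
        return NULL;
    b->len = n;
    memcpy(b->data, s, n + 1);
    return b;
}
```

The whole object is released with a single `free(b)`.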
  22. 27 May, 2017 1 commit
    • Fix memory leak in parser error path · 8627e4ed
      Nick Wellnhofer authored
      Triggered in mixed content ELEMENT declarations if there's an invalid
      name after the first valid name:
      
          <!ELEMENT para (#PCDATA|a|<invalid>)*>
      
      Found with libFuzzer and ASan.
  23. 07 Apr, 2017 2 commits
  24. 23 May, 2016 1 commit
    • Avoid building recursive entities · bdd66182
      Daniel Veillard authored
      For https://bugzilla.gnome.org/show_bug.cgi?id=762100
      
      When we detect a recursive entity we should really not
      build the associated data. Moreover, if someone bypasses
      libxml2 fatal errors and still tries to serialize a broken
      entity, make sure we don't risk getting into a recursion.
      
      * parser.c: xmlParserEntityCheck() don't build if an entity loop
        was found and remove the associated text content
      * tree.c: xmlStringGetNodeList() avoid a potential recursion