1. 17 Jun, 2017 1 commit
    • Nick Wellnhofer's avatar
      Merge duplicate code paths handling PE references · 03904159
      Nick Wellnhofer authored
      xmlParsePEReference is essentially a subset of
      xmlParserHandlePEReference, so make xmlParserHandlePEReference call
      xmlParsePEReference. The code paths in these functions differed
      slighty, but the code from xmlParserHandlePEReference seems more solid
      and tested.
      03904159
  2. 16 Jun, 2017 3 commits
  3. 12 Jun, 2017 4 commits
    • Nick Wellnhofer's avatar
      Treat URIs with scheme as absolute in C14N · 3939178e
      Nick Wellnhofer authored
      Fixes bug 783656.
      3939178e
    • Nick Wellnhofer's avatar
      Misc fixes for 'make tests' · 67f9f9d6
      Nick Wellnhofer authored
      - Silence test output.
      - Clean up after doc/examples tests.
      - Adjust expected output for script tests.
      - Add missing results for relaxng/pattern3
      
      There are still two test failures I can't comment on:
      
      - regexp/bug316338
      - schemas/any4_0
      67f9f9d6
    • Nick Wellnhofer's avatar
      Initialize keepBlanks in HTML parser · 0b2d5c48
      Nick Wellnhofer authored
      This caused failures in the HTML push tests but the fix required to
      change the expected output of the HTML SAX tests.
      0b2d5c48
    • David Kilzer's avatar
      Add test cases for bug 758518 · 85c112a0
      David Kilzer authored
      test/HTML/758518-entity.html exposed a bug in pushParseTest() in
      runtest.c which assumed that an input file was at least 4 bytes long.
      That test case is only 3 bytes, so we now take the minimum of 4 bytes
      or the length of the test input.  We also now use 'chunkSize' in place
      of the hard-coded value '1024' later in the function.
      85c112a0
  4. 11 Jun, 2017 3 commits
  5. 10 Jun, 2017 8 commits
    • Nick Wellnhofer's avatar
      Print error messages for truncated UTF-8 sequences · 79c8a6b1
      Nick Wellnhofer authored
      Before, truncated UTF-8 sequences at the end of a file were treated as
      EOF. Create an error message containing the offending bytes.
      
      xmlStringCurrentChar would also print characters from the input stream,
      not the string it's working on.
      79c8a6b1
    • Nick Wellnhofer's avatar
      Fix potential infinite loop in xmlStringLenDecodeEntities · fb2f518c
      Nick Wellnhofer authored
      Make sure that xmlParseStringPEReference advances the "str" pointer
      even if the parser was stopped. Otherwise xmlStringLenDecodeEntities
      can loop infinitely.
      fb2f518c
    • Nick Wellnhofer's avatar
      Remove useless check in xmlParseAttributeListDecl · 4ba8cc85
      Nick Wellnhofer authored
      Since we already successfully parsed the attribute name and other
      items, it is guaranteed that we made progress in the input stream.
      
      Comparing the input pointer to a previous value also looks fragile to
      me. What if the input buffer was reallocated and the new "cur" pointer
      happens to be the same as the old one? There are a couple of similar
      checks which also take "consumed" into account. This seems to be safer
      but I'm not convinced that it couldn't lead to false alarms in rare
      situations.
      4ba8cc85
    • Nick Wellnhofer's avatar
      Reset parser input pointers on encoding failure · f9e7997e
      Nick Wellnhofer authored
      Call xmlBufResetInput before bailing out if switching the encoding
      fails. Otherwise, the input pointers are left in an invalid state.
      This would typically lead to an internal error in xmlGROW but could also
      cause other unforeseen problems.
      f9e7997e
    • Nick Wellnhofer's avatar
      Fix memory leak in xmlParseEntityDecl error path · bedbef80
      Nick Wellnhofer authored
      When parsing the entity value, it can happen that an external entity
      with an unsupported encoding is loaded and the parser is stopped. This
      would lead to a memory leak.
      
      A custom SAX callback could also stop the parser.
      
      Found with libFuzzer and ASan.
      bedbef80
    • Nick Wellnhofer's avatar
      Allow zero sized memory input buffers · 94f6ce83
      Nick Wellnhofer authored
      Useful for a fuzz target I'm working on.
      94f6ce83
    • Nick Wellnhofer's avatar
      Fix xmlBuildRelativeURI for URIs starting with './' · 91e54967
      Nick Wellnhofer authored
      If the relative URI started with './', the 'pos' index was increased
      which also affected indexing into the base path. Aside from producing
      wrong results, this could also lead to a heap overread of the base
      path buffer. The data read from beyond the buffer was only compared
      to some char values, so this is mostly harmless.
      
      Inside libxml2, xmlBuildRelativeURI is only called from xinclude.c.
      
      Found with libFuzzer and ASan.
      91e54967
    • Nick Wellnhofer's avatar
      Add TODO comment in xmlSwitchEncoding · 45ce1ee3
      Nick Wellnhofer authored
      It would be nice if we could recover from unsupported encodings in
      external entities.
      45ce1ee3
  6. 07 Jun, 2017 5 commits
  7. 06 Jun, 2017 2 commits
  8. 05 Jun, 2017 2 commits
    • Nick Wellnhofer's avatar
      Fix buffer size checks in xmlSnprintfElementContent · 932cc989
      Nick Wellnhofer authored
      xmlSnprintfElementContent failed to correctly check the available
      buffer space in two locations.
      
      Fixes bug 781333 (CVE-2017-9047) and bug 781701 (CVE-2017-9048).
      
      Thanks to Marcel Böhme and Thuan Pham for the report.
      932cc989
    • Nick Wellnhofer's avatar
      Fix handling of parameter-entity references · e2663054
      Nick Wellnhofer authored
      There were two bugs where parameter-entity references could lead to an
      unexpected change of the input buffer in xmlParseNameComplex and
      xmlDictLookup being called with an invalid pointer.
      
      Percent sign in DTD Names
      =========================
      
      The NEXTL macro used to call xmlParserHandlePEReference. When parsing
      "complex" names inside the DTD, this could result in entity expansion
      which created a new input buffer. The fix is to simply remove the call
      to xmlParserHandlePEReference from the NEXTL macro. This is safe because
      no users of the macro require expansion of parameter entities.
      
      - xmlParseNameComplex
      - xmlParseNCNameComplex
      - xmlParseNmtoken
      
      The percent sign is not allowed in names, which are grammatical tokens.
      
      - xmlParseEntityValue
      
      Parameter-entity references in entity values are expanded but this
      happens in a separate step in this function.
      
      - xmlParseSystemLiteral
      
      Parameter-entity references are ignored in the system literal.
      
      - xmlParseAttValueComplex
      - xmlParseCharDataComplex
      - xmlParseCommentComplex
      - xmlParsePI
      - xmlParseCDSect
      
      Parameter-entity references are ignored outside the DTD.
      
      - xmlLoadEntityContent
      
      This function is only called from xmlStringLenDecodeEntities and
      entities are replaced in a separate step immediately after the function
      call.
      
      This bug could also be triggered with an internal subset and double
      entity expansion.
      
      This fixes bug 766956 initially reported by Wei Lei and independently by
      Chromium's ClusterFuzz, Hanno Böck, and Marco Grassi. Thanks to everyone
      involved.
      
      xmlParseNameComplex with XML_PARSE_OLD10
      ========================================
      
      When parsing Names inside an expanded parameter entity with the
      XML_PARSE_OLD10 option, xmlParseNameComplex would call xmlGROW via the
      GROW macro if the input buffer was exhausted. At the end of the
      parameter entity's replacement text, this function would then call
      xmlPopInput which invalidated the input buffer.
      
      There should be no need to invoke GROW in this situation because the
      buffer is grown periodically every XML_PARSER_CHUNK_SIZE characters and,
      at least for UTF-8, in xmlCurrentChar. This also matches the code path
      executed when XML_PARSE_OLD10 is not set.
      
      This fixes bugs 781205 (CVE-2017-9049) and 781361 (CVE-2017-9050).
      Thanks to Marcel Böhme and Thuan Pham for the report.
      
      Additional hardening
      ====================
      
      A separate check was added in xmlParseNameComplex to validate the
      buffer size.
      e2663054
  9. 01 Jun, 2017 10 commits
    • Nick Wellnhofer's avatar
      Check for integer overflow in xmlXPathFormatNumber · 7482f41f
      Nick Wellnhofer authored
      Check for overflow before casting double to int.
      
      Found with afl-fuzz and UBSan.
      7482f41f
    • Nick Wellnhofer's avatar
      Make Travis print UBSan stacktraces · 863b5792
      Nick Wellnhofer authored
      863b5792
    • Nick Wellnhofer's avatar
      Add .travis.yml · a2b53178
      Nick Wellnhofer authored
      For now this is mainly useful if you work on a fork of the libxml2
      mirror on GitHub:
      
          https://github.com/GNOME/libxml2
      
      Start with two build setups:
      
      - GCC with as many GNU extensions disabled as possible, trying to
        emulate a C89 compiler on a POSIX system.
      
      - clang with ASan and UBSan.
      
      The Python tests don't set an exit code, so Travis won't detect
      failures. The same goes for "make tests", but we only run "make check"
      anyway.
      a2b53178
    • Nick Wellnhofer's avatar
      83212ff4
    • Nick Wellnhofer's avatar
      Avoid reparsing in xmlParseStartTag2 · 855c19ef
      Nick Wellnhofer authored
      The code in xmlParseStartTag2 must handle the case that the input
      buffer was grown and reallocated which can invalidate pointers to
      attribute values. Before, this was handled by detecting changes of
      the input buffer "base" pointer and, in case of a change, jumping
      back to the beginning of the function and reparsing the start tag.
      
      The major problem of this approach is that whether an input buffer is
      reallocated is nondeterministic, resulting in seemingly random test
      failures. See the mailing list thread "runtest mystery bug: name2.xml
      error case regression test" from 2012, for example.
      
      If a reallocation was detected, the code also made no attempts to
      continue parsing in case of errors which makes a difference in
      the lax "recover" mode.
      
      Now we store the current input buffer "base" pointer for each (not
      separately allocated) attribute in the namespace URI field, which isn't
      used until later. After the whole start tag was parsed, the pointers
      to the attribute values are reconstructed using the offset between the
      new and the old input buffer. This relies on arithmetic on dangling
      pointers which is technically undefined behavior. But it seems like
      the easiest and most efficient fix and a similar approach is used in
      xmlParserInputGrow.
      
      This changes the error output of several tests, typically making it
      more verbose because we try harder to continue parsing in case of
      errors.
      
      (Another possible solution is to check not only the "base" pointer
      but the size of the input buffer as well. But this would result in
      even more reparsing.)
      855c19ef
    • Nick Wellnhofer's avatar
      Simplify control flow in xmlParseStartTag2 · 07b7428b
      Nick Wellnhofer authored
      Remove some goto labels and deduplicate a bit of code after handling
      namespaces.
      
      Before:
      
          loop {
              parseAttribute
              if (ok) {
                  if (defaultNamespace) {
                      handleDefaultNamespace
                      if (error)
                          goto skip_default_ns;
                      handleDefaultNamespace
          skip_default_ns:
                      freeAttr
                      nextAttr
                      continue;
                  }
                  if (namespace) {
                      handleNamespace
                      if (error)
                          goto skip_ns;
                      handleNamespace
          skip_ns:
                      freeAttr
                      nextAttr;
                      continue;
                  }
                  handleAttr
              } else {
                  freeAttr
              }
              nextAttr
          }
      
      After:
      
          loop {
              parseAttribute
              if (!ok)
                  goto next_attr;
              if (defaultNamespace) {
                  handleDefaultNamespace
                  if (error)
                      goto next_attr;
                  handleDefaultNamespace
              } else if (namespace) {
                  handleNamespace
                  if (error)
                      goto next_attr;
                  handleNamespace
              } else {
                  handleAttr
              }
          next_attr:
              freeAttr
              nextAttr
          }
      07b7428b
    • Nick Wellnhofer's avatar
      Disable LeakSanitizer when running API tests · ac9a4560
      Nick Wellnhofer authored
      The autogenerated API tests leak memory.
      ac9a4560
    • Nick Wellnhofer's avatar
      Avoid out-of-bound array access in API tests · ff34ba3e
      Nick Wellnhofer authored
      The API tests combine string buffers with arbitrary length values which
      makes ASan detect out-of-bound array accesses. Even without ASan, this
      could lead to unwanted test failures.
      
      Add a check for "len", "size", and "start" arguments, assuming they
      apply to the nearest char pointer. Skip the test if they exceed the
      buffer size. This is a somewhat naive heuristic but it seems to work
      well.
      ff34ba3e
    • Nick Wellnhofer's avatar
      Fix undefined behavior in xmlRegExecPushStringInternal · 34e44567
      Nick Wellnhofer authored
      It's stupid, but the behavior of memcpy(NULL, NULL, 0) is undefined.
      34e44567
    • Nick Wellnhofer's avatar
      Avoid spurious UBSan errors in parser.c · 47496724
      Nick Wellnhofer authored
      If available, use a C99 flexible array member to avoid spurious UBSan
      errors.
      47496724
  10. 31 May, 2017 2 commits