    Avoid reparsing in xmlParseStartTag2 · 855c19ef
    Nick Wellnhofer authored
    The code in xmlParseStartTag2 must handle the case where the input
    buffer is grown and reallocated, which can invalidate pointers to
    attribute values. Previously, this was handled by detecting changes of
    the input buffer "base" pointer and, in case of a change, jumping
    back to the beginning of the function and reparsing the start tag.
    
    The major problem of this approach is that whether an input buffer is
    reallocated is nondeterministic, resulting in seemingly random test
    failures. See the mailing list thread "runtest mystery bug: name2.xml
    error case regression test" from 2012, for example.
    
    If a reallocation was detected, the code also made no attempt to
    continue parsing in case of errors, which makes a difference in
    the lax "recover" mode.
    
    Now we store the current input buffer "base" pointer for each (not
    separately allocated) attribute in the namespace URI field, which isn't
    used until later. Once the whole start tag has been parsed, the pointers
    to the attribute values are reconstructed using the offset between the
    new and the old input buffer. This relies on arithmetic on dangling
    pointers, which is technically undefined behavior. But it seems like
    the easiest and most efficient fix, and a similar approach is used in
    xmlParserInputGrow.
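
    The pointer fix-up described above can be illustrated with a minimal,
    standalone sketch. It is not the libxml2 code: the Input and Attr
    types and the functions input_grow, attrs_rebase and demo_rebase are
    hypothetical names made up for this example. Two simplifications are
    assumed. The base pointer is kept in a dedicated field rather than in
    the namespace URI slot, and the old buffer is kept alive until the
    fix-up is done, so the sketch stays within defined behavior instead of
    doing arithmetic on dangling pointers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the parser's input buffer. */
typedef struct {
    char *base;   /* start of the buffer; changes when the buffer moves */
    size_t size;  /* allocated size in bytes */
} Input;

/* Hypothetical attribute record whose value points into the input. */
typedef struct {
    const char *value;  /* start of the attribute value */
    size_t len;         /* length of the value in bytes */
    const char *base;   /* input base at the time value was recorded */
} Attr;

/*
 * Grow the buffer into a fresh allocation so that it always moves.
 * The old block is returned through *old instead of being freed,
 * which is an assumption of this sketch, not of the real parser.
 */
static int input_grow(Input *in, size_t new_size, char **old) {
    char *p = malloc(new_size);
    if (p == NULL)
        return -1;
    memcpy(p, in->base, in->size);
    *old = in->base;
    in->base = p;
    in->size = new_size;
    return 0;
}

/* Rebase every attribute that was recorded against an older base. */
static void attrs_rebase(Attr *attrs, size_t n, const Input *in) {
    for (size_t i = 0; i < n; i++) {
        if (attrs[i].base != in->base) {
            /* Offset of the value inside the buffer it was taken from. */
            ptrdiff_t off = attrs[i].value - attrs[i].base;
            attrs[i].value = in->base + off;
            attrs[i].base = in->base;
        }
    }
}

/* Record one value, grow, record another, rebase. Returns 1 on success. */
static int demo_rebase(void) {
    static const char tag[] = "<a x='1' y='22'>";
    Input in;
    Attr attrs[2];
    char *old = NULL;
    int ok;

    in.size = sizeof(tag);
    in.base = malloc(in.size);
    if (in.base == NULL)
        return 0;
    memcpy(in.base, tag, in.size);

    /* Value of x ("1") starts at offset 6, recorded before the growth. */
    attrs[0].value = in.base + 6;
    attrs[0].len = 1;
    attrs[0].base = in.base;

    if (input_grow(&in, 64, &old) != 0) {
        free(in.base);
        return 0;
    }

    /* Value of y ("22") starts at offset 12, recorded after the growth. */
    attrs[1].value = in.base + 12;
    attrs[1].len = 2;
    attrs[1].base = in.base;

    /* Single fix-up pass once the whole start tag has been read. */
    attrs_rebase(attrs, 2, &in);
    free(old);

    ok = attrs[0].value == in.base + 6 &&
         memcmp(attrs[0].value, "1", attrs[0].len) == 0 &&
         memcmp(attrs[1].value, "22", attrs[1].len) == 0;
    free(in.base);
    return ok;
}
```

    Calling demo_rebase() should return 1: the value recorded before the
    growth is moved to the same offset in the new buffer, and the value
    recorded afterwards is left alone. No part of the tag is parsed a
    second time, which is the point of the change.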
    
    This changes the error output of several tests, typically making it
    more verbose because we try harder to continue parsing in case of
    errors.
    
    (Another possible solution is to check not only the "base" pointer
    but the size of the input buffer as well. But this would result in
    even more reparsing.)