`xmlSchemaNewDocParserCtxt` + `xmlSchemaParse` can result in dangling pointers
Hi,
Using `xmlSchemaNewDocParserCtxt` + `xmlSchemaParse` can result in dangling pointers. If the user creates a schema from a document but holds a reference to a "whitespace" node in the document, `xmlSchemaCleanupDoc` will free the whitespace node, leaving the user with a dangling pointer.
Here is a test program that reproduces the issue:
    #include <stdlib.h>
    #include <stdio.h>
    #include <string.h>
    #include <libxml/parser.h>
    #include <libxml/xmlschemas.h>

    /* Depth-first search for the first text node in a_node, its siblings,
     * or any of their descendants. */
    static xmlNode *
    find_text_node(xmlNode * a_node)
    {
        xmlNode *cur_node = NULL;

        for (cur_node = a_node; cur_node; cur_node = cur_node->next) {
            if (cur_node->type == XML_TEXT_NODE) {
                return cur_node;
            } else {
                xmlNode *found = find_text_node(cur_node->children);
                if (found) {
                    return found;
                }
            }
        }
        return NULL;
    }

    int main(int argc, char *argv[]) {
        const char * source = "<?xml version=\"1.0\" encoding=\"UTF-8\" ?><xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">\n<xs:element name=\"foo\" type=\"xs:string\"/></xs:schema>";
        xmlDocPtr doc = xmlReadMemory(source, (int) strlen(source), NULL, NULL, 0);
        xmlNodePtr root = xmlDocGetRootElement(doc);
        xmlSchemaParserCtxtPtr ctx;
        xmlSchemaPtr schema;

        /* The first text node is the "\n" between <xs:schema> and <xs:element> */
        xmlNode * text = find_text_node(root);
        printf("node type: Element, name: %s %d\n", text->name, xmlIsBlankNode(text));

        ctx = xmlSchemaNewDocParserCtxt(doc);
        schema = xmlSchemaParse(ctx);

        // `text` is now a dangling pointer
        printf("node type: Element, name: %s %d\n", text->name, xmlIsBlankNode(text));
        exit(0);
    }
If I run this program with `MallocScribble` enabled (on macOS), it reliably segfaults:
    $ env MallocScribble=1 ./a.out
    a.out(78495,0x102687dc0) malloc: enabling scribbling to detect mods to free blocks
    node type: Element, name: text 1
    fish: 'env MallocScribble=1 ./a.out' terminated by signal SIGSEGV (Address boundary error)
I tried changing the code to not clean documents if `preserve` is set on the context, but that made a bunch of tests fail. Specifically, this is the patch I tried:
    diff --git a/xmlschemas.c b/xmlschemas.c
    index 301c8449..315b85bb 100644
    --- a/xmlschemas.c
    +++ b/xmlschemas.c
    @@ -10641,7 +10641,9 @@ doc_load:
         /*
          * Remove all the blank text nodes.
          */
    -    xmlSchemaCleanupDoc(pctxt, docElem);
    +    if (!pctxt->preserve) {
    +        xmlSchemaCleanupDoc(pctxt, docElem);
    +    }
         /*
          * Check the schema's top level element.
          */