`&` Entity discarded when using XML_PARSE_RECOVER
If an element has a duplicate attribute, all occurrences of `&`
in the text of all its descendants are removed.
For instance, this document
<html foo='x' foo='x'><body><p>A & B</p></body></html>
is turned into
<html foo='x' foo='x'><body><p>A B</p></body></html>
(Originally reported as a Nokogiri issue: https://github.com/sparklemotion/nokogiri/issues/2267)
Reproduction test case
#include <stdio.h>
#include <string.h>
#include <libxml/parser.h>

int main(int argc, char **argv) {
    xmlDocPtr doc;
    xmlChar *xmlbuff;
    int buffersize;
    /* Duplicate attribute on <html> plus a bare '&' in a descendant text node */
    const char *h = "<html foo='x' foo='x'><body>A & B</body></html>";
    const int size = strlen(h);
    const int options = XML_PARSE_RECOVER;

    doc = xmlReadMemory(h, size, NULL, NULL, options);
    if (doc == NULL) return(1);  /* nothing could be recovered */
    xmlDocDumpFormatMemory(doc, &xmlbuff, &buffersize, 1);
    printf("%s", (char *) xmlbuff);
    xmlFree(xmlbuff);
    xmlFreeDoc(doc);
    return(0);
}
Versions affected
At least all versions between 2.9.3 and 2.9.12.