xmlXPathEvalExpression errors when certain Unicode code points are used in conjunction with XPath indexing syntax
When evaluating an XPath expression that contains certain Unicode characters with xmlXPathEvalExpression, an error is reported if XPath indexing syntax (i.e. [index]) is used. No error is reported when the indexing is absent.
For example, using the Unicode code point デ (U+30C7, KATAKANA LETTER DE):
This works: /デ
This errors: /デ[1]
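To rule out the source file's encoding as a factor when reproducing this, the two expressions can be built from explicit UTF-8 bytes instead of string literals. The following is a minimal sketch, not part of the original reproduction; encode_utf8 and build_xpath are hypothetical helper names, and it relies only on the fact that libxml2 expects xmlChar strings to be UTF-8.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libxml2 function): encode one Unicode code
 * point as UTF-8 into out, which must hold at least 4 bytes. Returns the
 * number of bytes written, or 0 for surrogates and values above U+10FFFF. */
size_t encode_utf8(unsigned long cp, unsigned char *out) {
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {
        return 0; /* surrogates are not valid scalar values */
    }
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Hypothetical helper: write "/" + UTF-8(cp) + suffix into buf as a
 * NUL-terminated string. Returns the string length, or 0 if the code
 * point is invalid or buf (of size cap) is too small. */
size_t build_xpath(unsigned long cp, const char *suffix, char *buf, size_t cap) {
    unsigned char enc[4];
    size_t n = encode_utf8(cp, enc);
    size_t slen = strlen(suffix);

    if (n == 0 || 1 + n + slen + 1 > cap) {
        return 0;
    }
    buf[0] = '/';
    memcpy(buf + 1, enc, n);
    memcpy(buf + 1 + n, suffix, slen + 1); /* also copies the terminating NUL */
    return 1 + n + slen;
}
```

Calling build_xpath(0x30C7, "", buf, sizeof buf) and build_xpath(0x30C7, "[1]", buf, sizeof buf) produces the two expressions above; the buffer can then be cast to const xmlChar * and passed to xmlXPathEvalExpression.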
Reproduction Code
#include <libxml/parser.h>
#include <libxml/xpath.h>

int main(void) {
    // Save this source file as UTF-8 so the literals below are UTF-8 encoded.
    const xmlChar *xpathExpressionGood = BAD_CAST "/デ";
    const xmlChar *xpathExpressionBad = BAD_CAST "/デ[1]";
    const char *filename = "unicode.xml";

    xmlInitParser();
    xmlDocPtr doc = xmlParseFile(filename);
    if (doc == NULL) {
        return 1;
    }
    xmlXPathContextPtr xpathContext = xmlXPathNewContext(doc);

    // This call does not report an error.
    xmlXPathObjectPtr xpathObj = xmlXPathEvalExpression(xpathExpressionGood, xpathContext);
    xmlXPathFreeObject(xpathObj);

    // This call reports an error.
    xpathObj = xmlXPathEvalExpression(xpathExpressionBad, xpathContext);
    xmlXPathFreeObject(xpathObj);

    xmlXPathFreeContext(xpathContext);
    xmlFreeDoc(doc);
    xmlCleanupParser();
    return 0;
}
Compiling and running the code above results in the following error:
XPath error : Invalid expression
/デ[1]
^
Operating System: Debian 10 Buster
libxml2 Version: 2.9.8
Edited by Kevin Gurney