XPath queries > 18k characters yield an unknown error in lxml versions >= 4.8.0
XPath queries longer than approximately 18,000 characters/bytes fail with "unknown error" in lxml 4.8.0 and higher (at least). In earlier versions (at least 4.5.2 and lower) this error does not occur.
Python              : sys.version_info(major=3, minor=10, micro=8, releaselevel='final', serial=0)
lxml.etree          : (4, 9, 1, 0)
libxml used         : (2, 9, 12)
libxml compiled     : (2, 9, 12)
libxslt used        : (1, 1, 34)
libxslt compiled    : (1, 1, 34)
Traceback (most recent call last):
  File "D:\Dropbox\jodijk\myprograms\python\LoL\test_lxml.py", line 137, in <module>
    main()
  File "D:\Dropbox\jodijk\myprograms\python\LoL\test_lxml.py", line 128, in main
    results = getresults(strees[stree], fullquery)
  File "D:\Dropbox\jodijk\myprograms\python\LoL\test_lxml.py", line 29, in getresults
    noderesults = stree.xpath(fullquery)
  File "src\lxml\etree.pyx", line 1599, in lxml.etree._Element.xpath
  File "src\lxml\xpath.pxi", line 305, in lxml.etree.XPathElementEvaluator.__call__
  File "src\lxml\xpath.pxi", line 225, in lxml.etree._XPathEvaluatorBase._handle_result
lxml.etree.XPathEvalError: unknown error
The relevant XPath queries are valid and work without any problem in other XPath query evaluators and in earlier versions of lxml.
We use named macros inside XPath expressions. Macro names are surrounded by %. We first expand these macros, which yields a full-fledged XPath expression, and only then pass it to lxml. The data below suggest that a length of more than about 18,000 characters/bytes after expansion causes the problem:
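To make the expansion step concrete, here is a minimal sketch of how %name% macros could be expanded before the query is evaluated. It is not our actual code: the function name expand_macros, the MACROS table, and the two short definitions in it are hypothetical stand-ins (the real definitions are far longer), and it assumes that macro names consist only of letters, digits and underscores.

```python
import re

# Hypothetical macro table for illustration only; the real definitions
# expand to several thousand characters each.
MACROS = {
    "declarative": '(@cat="smain" or @cat="sv1")',
    "Tarsp_B_X_count": "count(.//node[%declarative%])",
}

# A macro reference is a name between two percent signs, e.g. %declarative%.
MACRO_REF = re.compile(r"%(\w+)%")


def expand_macros(query, macros, max_passes=50):
    """Replace %name% references repeatedly until none are left.

    Macros may refer to other macros, so one pass is not always enough.
    An undefined macro name raises KeyError; a cyclic definition raises
    ValueError after max_passes passes.
    """
    for _ in range(max_passes):
        expanded = MACRO_REF.sub(lambda m: macros[m.group(1)], query)
        if expanded == query:
            return expanded
        query = expanded
    raise ValueError("macro expansion did not terminate (cyclic definition?)")


full_query = expand_macros("//node[%declarative% and %Tarsp_B_X_count% = 3]", MACROS)
print(len(full_query), full_query)
```

The lengths reported in this issue are those of the expanded string (len(full_query) in the sketch), not of the short form with macro names.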
query                                              length  result
//node[%declarative% and %Tarsp_B_X_count% = 3]     18165  invalid query: unknown error
//node[( %Tarsp_B_X_count% = 3)]                    14938  ok
//node[( %declarative% )]                            3239  ok
I can send an XPath query longer than 18k characters so that you can test this yourself, but I do not know how to attach a file here.
Since this is a problem with XPath, the lxml website suggests that the cause might lie in libxml2 rather than in lxml, which is why I am submitting it here.
I look forward to a response.