Commit bf227135 authored by Joel Hockey's avatar Joel Hockey Committed by Nick Wellnhofer
Browse files

Validate UTF8 in xmlEncodeEntities

Code is currently assuming UTF-8 without validating. Truncated UTF-8
input can cause out-of-bounds array access.

Adds further checks to partial fix in 50f06b3e.

Fixes #178
parent 1358d157
Pipeline #276659 passed with stage
in 17 minutes and 18 seconds
...@@ -704,11 +704,25 @@ xmlEncodeEntitiesInternal(xmlDocPtr doc, const xmlChar *input, int attr) { ...@@ -704,11 +704,25 @@ xmlEncodeEntitiesInternal(xmlDocPtr doc, const xmlChar *input, int attr) {
} else { } else {
/* /*
* We assume we have UTF-8 input. * We assume we have UTF-8 input.
* It must match either:
* 110xxxxx 10xxxxxx
* 1110xxxx 10xxxxxx 10xxxxxx
* 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
* That is:
* cur[0] is 11xxxxxx
* cur[1] is 10xxxxxx
* cur[2] is 10xxxxxx if cur[0] is 111xxxxx
* cur[3] is 10xxxxxx if cur[0] is 1111xxxx
* cur[0] is not 11111xxx
*/ */
char buf[11], *ptr; char buf[11], *ptr;
int val = 0, l = 1; int val = 0, l = 1;
if (*cur < 0xC0) { if (((cur[0] & 0xC0) != 0xC0) ||
((cur[1] & 0xC0) != 0x80) ||
(((cur[0] & 0xE0) == 0xE0) && ((cur[2] & 0xC0) != 0x80)) ||
(((cur[0] & 0xF0) == 0xF0) && ((cur[3] & 0xC0) != 0x80)) ||
(((cur[0] & 0xF8) == 0xF8))) {
xmlEntitiesErr(XML_CHECK_NOT_UTF8, xmlEntitiesErr(XML_CHECK_NOT_UTF8,
"xmlEncodeEntities: input not UTF-8"); "xmlEncodeEntities: input not UTF-8");
if (doc != NULL) if (doc != NULL)
......
Markdown is supported
0% or .
You are about to add 0 people to the discussion. Proceed with caution.
Finish editing this message first!
Please register or to comment