pango_log2vis_get_embedding_levels with incomplete Unicode sequences may write out of bounds
Not sure if this has been reported already, but it came up in a recent CTF.
The following program crashes with a segfault on the latest pango:
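#include <stdio.h>
#include <string.h>
#include <pango/pango.h>
#include <pango/pango-bidi-type.h>

int main(void) {
    PangoDirection dir = PANGO_DIRECTION_LTR;
    /* A lone 0xf8 lead byte: no continuation bytes follow it. */
    const char *inp = "\xf8";
    /* g_utf8_strlen returns a glong, hence %ld. */
    printf("utf8 len %ld\n", g_utf8_strlen(inp, strlen(inp)));
    /* Segfaults inside this call. */
    pango_log2vis_get_embedding_levels(inp, strlen(inp), &dir);
    return 0;
}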
This happens because g_utf8_strlen("\xf8", 1) is zero (the incomplete sequence is not counted), so n_chars will be zero at this point: https://gitlab.gnome.org/GNOME/pango/blob/eb2c647ff693bf3218fd1772f11a008bfbc975e7/pango/pango-bidi-type.c#L173
But because length = 1, the loop at https://gitlab.gnome.org/GNOME/pango/blob/eb2c647ff693bf3218fd1772f11a008bfbc975e7/pango/pango-bidi-type.c#L181 still executes at least once, leading to a NULL pointer dereference (g_new(.., 0) = NULL).
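To make the mismatch concrete, here is a rough stand-alone model of the two counts. It is not the pango or GLib source: seq_len, count_complete_chars and count_loop_iterations are hypothetical helpers that only approximate how I understand g_utf8_strlen and the byte-bounded loop to behave, and the model ignores NUL termination.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sequence length implied by a lead byte, approximating
   GLib's skip table (which still knows the obsolete 5- and 6-byte forms). */
static size_t seq_len(unsigned char c)
{
    if (c < 0xc0) return 1;
    if (c < 0xe0) return 2;
    if (c < 0xf0) return 3;
    if (c < 0xf8) return 4;
    if (c < 0xfc) return 5;
    if (c < 0xfe) return 6;
    return 1;
}

/* Models the allocation size: a character is counted only if its whole
   sequence fits inside the buffer, so a truncated tail is skipped. */
static size_t count_complete_chars(const char *text, size_t length)
{
    assert(text != NULL);
    size_t n = 0, pos = 0;
    while (pos < length) {
        pos += seq_len((unsigned char)text[pos]);
        if (pos <= length)
            n++;
    }
    return n;
}

/* Models the loop: it runs once for every sequence that starts inside the
   buffer, whether or not that sequence is complete. */
static size_t count_loop_iterations(const char *text, size_t length)
{
    assert(text != NULL);
    size_t n = 0, pos = 0;
    while (pos < length) {
        n++;
        pos += seq_len((unsigned char)text[pos]);
    }
    return n;
}
```

In this model, the input "\xf8" with length 1 yields 0 complete characters but 1 loop iteration, and "a\xf8" with length 2 yields 1 and 2. For fully valid input such as "abc" the two counts agree.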
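In general, this issue leads to an out-of-bounds heap write: for longer inputs that end in an incomplete sequence, the array is non-NULL but the loop appears to run once more than there are allocated elements. It can be triggered via pango_itemize if the bytes passed to pango_itemize are user-controlled.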