Commit 8badee0f authored by David Woodhouse

Tell libxml to parse the document as ISO8859-1.

It isn't *really* ISO8859-1; we hope it'll be UTF-8. But it doesn't make any
difference to the XML parsing, which is all ASCII anyway. It only affects
the *content* of the data nodes... and in fact it doesn't matter for those
*either* because libxml doesn't attempt to do any translation; it just gives
us the strings.

The *only* difference that setting ISO8859-1 makes, as far as I know, is that
it stops libxml from aborting when it sees legacy 8-bit crap in the content.
Which *does* happen with broken mails, especially spam.
parent 6ad69cfe
@@ -2126,10 +2126,14 @@ handle_server_response (SoupSession *session, SoupMessage *msg, gpointer data)
g_debug ("handle_server_response - performing xmlReadMemory");
// Otherwise process the server response
/* libxml doesn't really need to know what the charset is, as long as it's
ASCII-compatible so that the XML parsing can work. So tell it ISO8859-1
so that it will tolerate invalid UTF-8 sequences in node content, which
happen often in broken emails (especially spam). */
doc = xmlReadMemory ( (const char*) xml,
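
For illustration only (this is not part of the commit): the sketch below shows why legacy 8-bit content is a problem for a parser that insists on UTF-8. A lone byte such as 0xE9 ("é" in ISO8859-1) is not a well-formed UTF-8 sequence, whereas every byte value is a valid ISO8859-1 character. The helper name `is_valid_utf8` is hypothetical and is not a libxml function; it only mimics the kind of well-formedness check a strict UTF-8 decoder performs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, for illustration only: returns true if the
   buffer s[0..len) is well-formed UTF-8. */
static bool is_valid_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char c = s[i];
        unsigned long cp;   /* decoded code point */
        size_t need;        /* number of continuation bytes expected */

        if (c < 0x80) {     /* plain ASCII */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {
            need = 1; cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {
            need = 2; cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {
            need = 3; cp = c & 0x07;
        } else {
            return false;   /* stray continuation byte or invalid lead byte */
        }

        if (len - i - 1 < need)
            return false;   /* sequence truncated at end of buffer */

        for (size_t k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return false;   /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        /* Reject overlong encodings, surrogates, and values past U+10FFFF. */
        if ((need == 1 && cp < 0x80) ||
            (need == 2 && cp < 0x800) ||
            (need == 3 && cp < 0x10000))
            return false;
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return false;
        if (cp > 0x10FFFF)
            return false;

        i += need + 1;
    }
    return true;
}
```

With this helper, the four bytes 63 61 66 E9 ("café" in ISO8859-1) are rejected, while the five bytes 63 61 66 C3 A9 (the same word in UTF-8) are accepted. Content of the first kind is what the commit message describes as arriving in broken mails.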