Commit 1bb41479 authored by Andrés G. Aragoneses

Revert "Migo: Fixed parsing RSSs with BOM (bgo#727432)"

This reverts commit 27f1135e.

There were some problems with this patch:

a) When improving the version that Marcin posted in bugzilla,
in order to not hardcode UTF8 encoding, I incorrectly placed
this logic inside an if block like this:

 if (s.StartsWith("<?xml")) {

Given that the logic to remove the bom also asks for

 if (s.StartsWith(bom)) {

You would think that the logic added to fix this bug
would simply not work (because the condition that made
the code flow enter into the parent block would make
the second condition FALSE).

However, surprisingly enough, the bom from the podcast
example in bugzilla [0] is parsed as UTF8, UTF8's
ByteOrderMark has a length of 1, and Mono returns true
for both of the conditions above, even though the bom is
obviously not "<" (the first character of the "<?xml"
string used to check whether the content is XML).

I still don't understand how this can be possible, but
the consequence is that the content is stripped of its
first character, so all XML feeds received that started
with "<?xml" were converted to a string starting with
"?xml", which clearly resulted in a parsing error.

Not sure if this is a Mono bug, and not sure this is
fixed in newer versions, but with the version I'm using
now (3.2.8) I cannot reproduce bug 727432, and this is
a very widely used version of Mono (it comes with
Ubuntu 14.04 LTS, 14.10, and 15.04). So I'm reverting
the fix for this (until we figure out a better fix, and
the exact environment to reproduce it), otherwise many
more podcasts would be broken.

[0] http://podcast.dr.dk/p1/rssfeed/orientering.xml
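
One possible explanation for the behaviour described above, offered as an assumption rather than something this commit confirms: `String.StartsWith (string)` called without a `StringComparison` argument performs a culture-sensitive comparison, and such comparisons can treat U+FEFF (the decoded UTF8 preamble) as an ignorable character, so the check may succeed for any string. The sketch below compares the default and ordinal checks. `BomStartsWithSketch` and `StripBom` are hypothetical names used only for illustration and are not part of Migo.

```csharp
using System;
using System.Text;

class BomStartsWithSketch
{
    // Hypothetical helper: strips the decoded preamble only when it is
    // literally present at the start of the string (ordinal comparison).
    static string StripBom (string s, Encoding enc)
    {
        string bom = enc.GetString (enc.GetPreamble ());
        if (bom.Length > 0 && s.StartsWith (bom, StringComparison.Ordinal)) {
            return s.Substring (bom.Length);
        }
        return s;
    }

    static void Main ()
    {
        // The UTF8 preamble (EF BB BF) decodes to the single char U+FEFF.
        string bom = Encoding.UTF8.GetString (Encoding.UTF8.GetPreamble ());
        string xml = "<?xml version=\"1.0\"?>";

        // Culture-sensitive check, as in the reverted patch. On runtimes
        // that ignore U+FEFF when comparing, this can be true even though
        // the string starts with "<".
        Console.WriteLine (xml.StartsWith (bom));

        // Ordinal check: compares the chars themselves.
        Console.WriteLine (xml.StartsWith (bom, StringComparison.Ordinal));

        // With the ordinal check, a feed without a bom is left untouched
        // and a feed with a bom loses only the bom.
        Console.WriteLine (StripBom (xml, Encoding.UTF8));
        Console.WriteLine (StripBom (bom + xml, Encoding.UTF8));
    }
}
```

If this assumption holds for the Mono version in question, an ordinal comparison might be one way to reintroduce the bom stripping without truncating feeds that start with "<?xml". That would still need to be verified against the feed in [0].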
parent 2880a6e0
@@ -735,19 +735,12 @@ namespace Migo.Net
             Match match = encoding_regexp.Match (s);
             if (match.Success && match.Groups.Count > 0) {
                 string encodingStr = match.Groups[1].Value;
-                Encoding enc = Encoding;
                 try {
-                    enc = Encoding.GetEncoding (encodingStr);
+                    Encoding enc = Encoding.GetEncoding (encodingStr);
                     if (!enc.Equals (Encoding)) {
                         s = enc.GetString (resultPtr);
                     }
                 } catch (ArgumentException) {}
-                string bom = enc.GetString (enc.GetPreamble ());
-                if (s.StartsWith (bom)) {
-                    s = s.Remove (0, bom.Length);
-                }
             }
         }
     } catch (Exception ex) {