Commit 6cf5160d authored by Joanmarie Diggs

Web: Fix performance issue resulting from detecting offscreen text brokenness

When authors hide text offscreen so that only screen readers will find
it and present it, they think they are being helpful. Unfortunately,
a side effect of their techniques is that the accessible text we get can
be broken (e.g. asking for the line at an offset returns only a single
char or word). Thus we have to sanity check all text in order to work around
this. Normally this is not a performance problem because we can bail
after checking the first line. But in a giant text object whose contents
consist almost entirely of embedded object chars, we can get quite laggy.
Therefore, if the accessible text is more than 30% embedded object chars,
bail on the lines-are-single-words sanity check.
parent 53b04673
@@ -3029,6 +3029,16 @@ class Utilities(script_utilities.Utilities):
        if not nChars:
            return False
        # If we have a series of embedded object characters, there's a reasonable chance
        # they'll look like the one-word-per-line CSSified text we're trying to detect.
        # We don't want that false positive. By the same token, the one-word-per-line
        # CSSified text we're trying to detect can have embedded object characters. So
        # if we have more than 30% EOCs, don't use this workaround. (The 30% is based on
        # testing with problematic text.)
        eocs = re.findall(self.EMBEDDED_OBJECT_CHARACTER, text.getText(0, -1))
        if len(eocs)/nChars > 0.3:
            return False
        try:
            obj.clearCache()
            state = obj.getState()
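
For readers who want to try the threshold outside of Orca, the ratio check in the hunk above can be sketched as a standalone function. This is only an illustration, not the code from the commit: the function name `exceeds_eoc_threshold` and its `threshold` parameter are hypothetical, and the sketch assumes the embedded object character is U+FFFC (the Unicode object replacement character) and that the text is already available as a plain Python 3 string rather than an accessible text interface.

```python
import re

# Assumption: the embedded object character is U+FFFC, the Unicode
# "object replacement character" used to mark embedded child objects.
EMBEDDED_OBJECT_CHARACTER = "\ufffc"


def exceeds_eoc_threshold(text: str, threshold: float = 0.3) -> bool:
    """Return True if more than `threshold` of the characters are EOCs.

    Hypothetical helper mirroring the check in the diff: when it returns
    True, the caller would skip the lines-are-single-words sanity check.
    """
    n_chars = len(text)
    if not n_chars:
        # Empty text: nothing to measure, so the threshold is not exceeded.
        return False

    eocs = re.findall(EMBEDDED_OBJECT_CHARACTER, text)
    # True division (Python 3), so the ratio is a float between 0 and 1.
    return len(eocs) / n_chars > threshold


if __name__ == "__main__":
    mostly_objects = EMBEDDED_OBJECT_CHARACTER * 4 + "abcdef"  # 4 of 10 chars
    mostly_text = EMBEDDED_OBJECT_CHARACTER + "some ordinary text"
    print(exceeds_eoc_threshold(mostly_objects))
    print(exceeds_eoc_threshold(mostly_text))
```

Note that the comparison is strict, as in the commit: text that is exactly 30% embedded object characters still goes through the sanity check.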