Fix space collapsing in block elements
This fixes this issue and improves performance:
- It ensures that spaces are also collapsed inside the supported block elements such as `<h1>`, `<h2>`, etc., `<p>`, `<blockquote>`, and `<li>` by adjusting `get_text_content`.
- It reimplements `replace_html_whitespace` with an efficient method that already collapses whitespace while it is replacing.
- It introduces the `join_collapse_spaces` function that is like `collapse_spaces` but just for a single string join.
- It uses this `join_collapse_spaces` to only remove whitespace while joining text blocks. This way it is not continuously calling `collapse_spaces` on an ever-growing string, so it is now O(n) instead of O(n²) in the number of text blocks to join.
- It expands the newline tests by adding tests on strings with newlines in (nested) block elements.
- It fixes linebreaks not being replaced when getting text content (mostly to get inline text of block elements).
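
The joining strategy described above can be illustrated with a minimal Python sketch. It is not the code from this change: the function names are borrowed from the description, but their signatures, the `join_blocks` helper, and the set of characters treated as HTML whitespace (space, tab, LF, FF, CR) are assumptions made for the example.

```python
# Assumption: the ASCII whitespace characters as defined by the HTML standard.
HTML_WHITESPACE = " \t\n\x0c\r"


def replace_html_whitespace(text):
    """Replace HTML whitespace with spaces, collapsing runs in the same pass."""
    out = []
    prev_space = False
    for ch in text:
        if ch in HTML_WHITESPACE:
            # Emit a single space for the whole run of whitespace.
            if not prev_space:
                out.append(" ")
            prev_space = True
        else:
            out.append(ch)
            prev_space = False
    return "".join(out)


def join_collapse_spaces(parts, next_block):
    """Append next_block to parts, collapsing spaces only at the join boundary.

    Hypothetical signature: both sides are assumed to be collapsed already,
    so only the last character of the previous piece and the first character
    of the next piece need to be inspected.
    """
    if parts and parts[-1].endswith(" ") and next_block.startswith(" "):
        next_block = next_block[1:]
    if next_block:
        parts.append(next_block)


def join_blocks(blocks):
    """Join text blocks without rescanning the text accumulated so far."""
    parts = []
    for block in blocks:
        join_collapse_spaces(parts, replace_html_whitespace(block))
    return "".join(parts)
```

For example, `join_blocks(["Hello \n", "  world", " ", "!"])` returns `"Hello world !"`. Each join only looks at the boundary characters, so the total work grows with the size of the input, rather than rescanning the accumulated string for every block.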
Edited by Paul van Tilburg