
Fix space collapsing in block elements

This merge request fixes the space-collapsing issue and improves performance:

  • It ensures that spaces are also collapsed inside the supported block elements (headings such as <h1> and <h2>, <p>, <blockquote> and <li>) by adjusting get_text_content.
  • It reimplements replace_html_whitespace with a more efficient method that collapses whitespace in the same pass as the replacement.
  • It introduces the join_collapse_spaces function, which is like collapse_spaces but applies only to a single string join.
  • It uses join_collapse_spaces to remove whitespace only at the point where text blocks are joined. This avoids repeatedly calling collapse_spaces on an ever-growing string, so joining is now O(n) instead of O(n²) in the number of text blocks.
  • It expands the newline tests by adding tests on strings with newlines in (nested) block elements.
  • It fixes line breaks not being replaced when getting text content (mostly relevant when getting the inline text of block elements).
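
The single-pass replacement and the join-time collapsing described in the list above can be sketched roughly as follows. This is an illustrative Python sketch, not the code from this merge request: the function names are borrowed from the description, but their signatures, the helper join_text_blocks, and the choice to accumulate parts in a list are all assumptions made for the example.

```python
from typing import Iterable, List

# ASCII whitespace as defined by the HTML standard.
HTML_WHITESPACE = " \t\n\r\f"


def replace_html_whitespace(text: str) -> str:
    """Replace every run of HTML whitespace with one space, in a single pass."""
    out: List[str] = []
    in_whitespace = False
    for ch in text:
        if ch in HTML_WHITESPACE:
            # Emit a space only for the first character of a whitespace run.
            if not in_whitespace:
                out.append(" ")
            in_whitespace = True
        else:
            out.append(ch)
            in_whitespace = False
    return "".join(out)


def join_collapse_spaces(parts: List[str], block: str) -> None:
    """Append block to parts, collapsing spaces only at the seam.

    Hypothetical signature: only the end of the last part and the start of
    the new block are inspected, so earlier text is never rescanned.
    """
    if parts and parts[-1].endswith(" "):
        block = block.lstrip(" ")
    if block:
        parts.append(block)


def join_text_blocks(blocks: Iterable[str]) -> str:
    """Hypothetical helper: normalize each block once, then join them."""
    parts: List[str] = []
    for block in blocks:
        join_collapse_spaces(parts, replace_html_whitespace(block))
    return "".join(parts)
```

Because each join looks only at the boundary between the text so far and the new block, the number of collapse operations grows linearly with the number of blocks, rather than each join rescanning everything accumulated before it.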
Edited by Paul van Tilburg
