Box layout performance optimizations, part 0
Since commit 76c46739 "boxlayout: Fix broken min-size-for-opposite-size",
GtkBoxLayout runs a binary search to determine its minimum size along the
opposite orientation, trying different sizes and seeing if its children
fit the provided for_size constraint along the natural orientation. This
is only done if any of the box's children are not constant-size.
A notable source of non-constant-size-ness in a widget tree is a GtkLabel
with wrap = true, which runs a single Pango layout to measure its
height-for-width, but has to run its own binary search, doing a Pango
layout at each step, when it has to measure its width-for-height. So if a
wrappable label is located in a horizontal box, there will be two levels
of binary searches performed. Moreover, the number of nested binary
searches gets amplified if the label is nested inside several layers of
boxes of alternating orientations, which causes a huge number of Pango
layouts to be performed.
This has been found to be a source of major performance issues in the Fractal and Paper Plane chat apps.
Partly address the issue by noticing that the common case is actually the box having a single non-constant-size child and a number of constant-size children; for example, this can be a wrappable label with the message text (height-for-width), the sender's avatar (constant-size), and a non-wrappable label with the date/time the message was sent (also constant-size). The same is true of the many layers of outer boxes, where the inner box is the only non-constant-size child.
In this simple and common case, we can ask the single non-constant-size child for its minimum size in the opposite orientation directly, without running a binary search. This cuts down on the number of Pango layouts dramatically.
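
To illustrate the idea, here is a toy model in Python of a horizontal box measuring its minimum height for a given width. It is only a sketch and not the actual GtkBoxLayout code: the names `Fixed`, `WrapLabel`, `make_row`, `min_height_search` and `min_height_direct` are made up for this example, glyphs are assumed to be 1x1 cells, and spacing, margins, expansion and baselines are ignored. The `layouts` counter stands in for the number of Pango layouts.

```python
import math


class Fixed:
    """Constant-size child (e.g. an avatar): its size ignores for_size."""
    constant_size = True

    def __init__(self, width, height):
        self.width, self.height = width, height

    def height_for_width(self, width):
        return self.height

    def width_for_height(self, height):
        return self.width


class WrapLabel:
    """Toy wrapping label: n_chars glyphs of 1x1 each (an assumption)."""
    constant_size = False

    def __init__(self, n_chars):
        self.n_chars = n_chars
        self.layouts = 0

    def height_for_width(self, width):
        self.layouts += 1  # stands in for one Pango layout
        return math.ceil(self.n_chars / max(width, 1))

    def width_for_height(self, height):
        # Inner binary search: smallest width whose wrapped height fits.
        lo, hi = 1, self.n_chars
        while lo < hi:
            mid = (lo + hi) // 2
            if self.height_for_width(mid) <= height:
                hi = mid
            else:
                lo = mid + 1
        return lo


def min_height_search(children, for_width):
    """Slow path: outer binary search over the height of a horizontal box."""
    lo, hi = 1, max(c.height_for_width(1) for c in children)
    while lo < hi:
        mid = (lo + hi) // 2
        if sum(c.width_for_height(mid) for c in children) <= for_width:
            hi = mid
        else:
            lo = mid + 1
    return max([lo] + [c.height for c in children if c.constant_size])


def min_height_direct(children, for_width):
    """Fast path: with exactly one non-constant-size child, ask it directly."""
    flexible = [c for c in children if not c.constant_size]
    if len(flexible) != 1:
        return min_height_search(children, for_width)
    fixed = [c for c in children if c.constant_size]
    # Whatever width the constant-size children leave over goes to the label.
    remaining = for_width - sum(c.width for c in fixed)
    height = flexible[0].height_for_width(remaining)
    return max([height] + [c.height for c in fixed])


def make_row():
    """Avatar, wrapping message text, timestamp."""
    return [Fixed(4, 4), WrapLabel(200), Fixed(6, 1)]


if __name__ == "__main__":
    slow_row, fast_row = make_row(), make_row()
    print(min_height_search(slow_row, 30), "layouts:", slow_row[1].layouts)
    print(min_height_direct(fast_row, 30), "layouts:", fast_row[1].layouts)
```

In this model both paths agree on the minimum height, but the direct path performs a single layout on the label, while the search path performs one inner binary search per step of the outer one.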
While this helps a lot with my sample test layouts, I have been unable to see how much this would help actual Fractal and Paper Plane, because Flatpak makes it way too difficult to run an app against a patched version of GTK. (If anyone has any tips on how one can do this with reasonable ease, please teach me!)
This (along with !6219 (merged)) is the most straightforward, least intrusive optimization that should already have a large impact. I have some ideas for speeding things up further, some of which I have implemented, but they're more involved (new GtkSizeRequestCache APIs...) and I would expect them to be a lot more controversial, so I'm not submitting those, at least for now.