
Add GLBuffer implementation w/ persistent mapping

Benjamin Otte requested to merge wip/otte/gl-map-buffer into main

If glBufferStorage() is available, we can replace our usage of glBufferSubData() with persistently mapped storage via glMapBufferRange().

This has 1 disadvantage:

  1. It's not supported everywhere: it requires GL 4.4 or GL_EXT_buffer_storage. Every GPU of the last 10 years should implement it, but we still check for it at runtime and keep the old code as a fallback. The old code can also be forced via GDK_GL_DISABLE=buffer-storage.

But it has 2 advantages:

  1. It is what Vulkan does, so it unifies the two renderers' buffer handling.

  2. It is a significant performance boost in use cases with large vertex buffers. Those are pretty rare, but do happen with lots of text at a small font size. An example would be a small font in a maximized VTE terminal or the overview in gnome-text-editor.
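
The availability check mentioned in the disadvantage above could look roughly like the following. This is only a sketch, not the code in this branch: `buffer_storage_usable()` and `list_contains()` are hypothetical names, the extension list is passed in as one space-separated string for simplicity, and treating the GDK_GL_DISABLE value as a list of space- or comma-separated names is an assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Returns true if `name` appears as a whole token in `list`.
 * Tokens are assumed to be separated by spaces or commas. */
static bool
list_contains (const char *list, const char *name)
{
  const char *p = list;
  size_t len;

  assert (name != NULL);
  len = strlen (name);

  if (list == NULL || len == 0)
    return false;

  while ((p = strstr (p, name)) != NULL)
    {
      bool starts = (p == list) || p[-1] == ' ' || p[-1] == ',';
      bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == ',';

      if (starts && ends)
        return true;

      p += len;
    }

  return false;
}

/* Hypothetical helper: decide whether the persistently mapped path
 * may be used.
 *   gl_major/gl_minor: version of the current GL context
 *   extensions:        space-separated extension names, may be NULL
 *   gdk_gl_disable:    value of GDK_GL_DISABLE, may be NULL */
bool
buffer_storage_usable (int         gl_major,
                       int         gl_minor,
                       const char *extensions,
                       const char *gdk_gl_disable)
{
  /* An explicit opt-out always wins, so the old code can be forced. */
  if (list_contains (gdk_gl_disable, "buffer-storage"))
    return false;

  /* glBufferStorage() is core since GL 4.4. */
  if (gl_major > 4 || (gl_major == 4 && gl_minor >= 4))
    return true;

  /* Otherwise the extension has to be advertised. */
  return list_contains (extensions, "GL_EXT_buffer_storage");
}
```

With these rules, a 4.4 context gets the new path unless the environment value names buffer-storage, and an older context only gets it when the extension is listed.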

A custom benchmark tailored for this problem can be created with:

tests/rendernode-create-tests 1000000 text.node

This creates a node file called "text.node" that draws 1 million text nodes.
(Creating that test takes a minute or so. A smaller number may be useful on less powerful hardware than my Intel Tigerlake laptop.)
The difference can then be compared via:

tools/gtk4-rendernode-tool benchmark --runs=20 text.node

and

GDK_GL_DISABLE=buffer-storage tools/gtk4-rendernode-tool benchmark --runs=20 text.node

Here are a few benchmark numbers from my machines:

| computer | size | GL before | GL after | Vulkan before | Vulkan after |
|---|---|---|---|---|---|
| TigerLake | 1M | 1.1s | 0.8s | 1.0s | 1.0s |
| Radeon RX6550XT | 1M | 1.6s | 0.7s | 2.5s | 0.9s |
| Radeon RX6950XT | 1M | 0.36s | 0.3s | 1.55s | 0.6s |
| Radeon integrated | 1M | 1.7s | 1.2s | 2.8s | 1.1s |
| RPi 4 | 100k | 2.0s | 1.9s | | |

And here's the difference in a flamegraph (top is after, bottom is before):

image

Related: !7021 (closed)

Edited by Benjamin Otte
