GValue: Add support for interned string
In GStreamer we use GValue for storing the contents of GstStructure (which is essentially a named hash table). That structure is the main component of GstCaps, the capabilities of streams.
Ex: video/x-raw,format=(string)RGBA,width=(int)1920,height=(int)1080 describes a 1080p RGB stream, where the content of each field is stored as a GValue. This would be stored essentially as:
- video/x-raw is stored as a GQuark, and the contents are an array of:
  - GQuark for the field name
  - GValue for the field content
The performance problem we've encountered is during negotiation, where we potentially do a lot of copies and comparisons of the various fields (to see if they match/intersect). For all the "regular" types (ints, floats, ...) those copies and comparisons are fast and cheap, but for strings it's a completely different business.
- Copying is expensive: copying a string GValue will do a strdup()
- Comparing (equal or not) is expensive: we don't know whether the string in a GValue is canonical/interned or not, so we have to go through strcmp()
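To make the two costs above concrete, here is a minimal sketch in plain C. It is a toy model, not GLib code: `ToyStringValue` and its helper functions are hypothetical names invented for illustration, and the copy is written with `malloc()`/`memcpy()` to mimic what a `strdup()` does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a string-holding value; NOT GLib's GValue. */
typedef struct {
    char *str; /* heap copy owned by this value, may be NULL */
} ToyStringValue;

/* Duplicate a C string the way strdup() would: one allocation, one copy. */
static char *toy_dup(const char *s)
{
    if (s == NULL)
        return NULL;
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

static ToyStringValue toy_string_value_new(const char *s)
{
    ToyStringValue v = { toy_dup(s) };
    return v;
}

/* Every copy pays for an allocation plus a byte-by-byte duplication. */
static ToyStringValue toy_string_value_copy(const ToyStringValue *src)
{
    assert(src != NULL);
    ToyStringValue dest = { toy_dup(src->str) };
    return dest;
}

/* Without knowing whether the pointers are canonical, equality has to
 * walk both strings with strcmp(). */
static bool toy_string_value_equal(const ToyStringValue *a,
                                   const ToyStringValue *b)
{
    if (a->str == NULL || b->str == NULL)
        return a->str == b->str;
    return strcmp(a->str, b->str) == 0;
}

static void toy_string_value_clear(ToyStringValue *v)
{
    free(v->str);
    v->str = NULL;
}
```

During negotiation these operations are repeated for every string field of every structure being copied or intersected, which is where the cost adds up.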
Furthermore:
- We can't change the fact that we use GValue, it's essentially public API
- We can't control the creation of those GValues (a user could create a string GValue and set it on a structure/caps field), so we can't assume string values are all interned/canonical.
Therefore, what I'm proposing is to add an API to GLib to create string values from interned strings:
- They are guaranteed to exist for the duration of the process. Therefore when copying them (via g_value_copy), no memory duplication is required.
- This only requires storing a flag in the 2nd GValue data field, and therefore those interned strings are still 100% compatible with existing string GValue API.
- Their contents can be compared with a simple pointer comparison (as opposed to strcmp).
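The idea in the list above can be sketched roughly as follows. This is a self-contained toy model in plain C, not the proposed GLib patch: `ToyValue`, the `is_interned` flag, and all function names are hypothetical. It assumes every interned string comes from a single canonical table (as with `g_intern_string()`), so two interned strings with equal contents always share one pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a string value carrying an "interned" flag;
 * NOT GLib's actual GValue layout. */
typedef struct {
    const char *str;  /* string data */
    bool is_interned; /* true: str lives for the whole process, never freed */
} ToyValue;

static char *toy_dup(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Regular string value: owns a private heap copy. */
static ToyValue toy_value_from_owned(const char *s)
{
    ToyValue v = { toy_dup(s), false };
    return v;
}

/* Interned string value: only stores the pointer and sets the flag. */
static ToyValue toy_value_from_interned(const char *interned)
{
    ToyValue v = { interned, true };
    return v;
}

/* Copying an interned value shares the pointer; no allocation needed. */
static ToyValue toy_value_copy(const ToyValue *src)
{
    assert(src != NULL);
    if (src->is_interned)
        return *src;
    return toy_value_from_owned(src->str);
}

static bool toy_value_equal(const ToyValue *a, const ToyValue *b)
{
    if (a->str == b->str)
        return true;  /* same pointer: equal whatever the flags say */
    if (a->is_interned && b->is_interned)
        return false; /* assumed canonical: different pointers differ */
    if (a->str == NULL || b->str == NULL)
        return false;
    return strcmp(a->str, b->str) == 0; /* fallback for regular strings */
}

static void toy_value_clear(ToyValue *v)
{
    if (!v->is_interned)
        free((void *) v->str); /* only owned copies are released */
    v->str = NULL;
}
```

Values that are not interned keep the existing behaviour (duplicate on copy, `strcmp` on compare), which is why such a flag can stay compatible with code that creates string values the usual way.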
In my local testing, this brings in a 5x-10x performance boost to caps negotiation for GStreamer (yes, we sadly have a lot of strings in fields ...)