Possible memory corruption with GLib.Regex.match
import gi
gi.require_version("GLib", "2.0")
from gi.repository import GLib
s = "hello"
r = GLib.Regex.new(s, 0, 0)
(success, match) = r.match(s, 0)
assert success, "should match"
try:
    groups = match.fetch_all()
    print(groups)
except UnicodeDecodeError as e:
    print(e)
try:
    groups = match.fetch_all()
    print(groups)
except UnicodeDecodeError as e:
    print(e)
Running this file on my machine prints a UnicodeDecodeError in the first try block, and then crashes with "double free or corruption (out)" on the second match.fetch_all() call.
(venv)$ python3 main.py
'utf-8' codec can't decode byte 0x90 in position 0: invalid start byte
double free or corruption (out)
Aborted (core dumped)
Running the code in a REPL gives different results for the first fetch_all() call, and they are not always reproducible. In the session below it returned ['en_US'], which is not part of the subject string at all.
(venv)$ python3
Python 3.10.12 (main, Jun 11 2023, 05:26:28) [GCC 11.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import gi
>>> gi.require_version("GLib", "2.0")
>>> from gi.repository import GLib
>>> s = "hello"
>>> r = GLib.Regex.new(s, 0, 0)
>>> (success, match) = r.match(s, 0)
>>> assert success
>>> groups = match.fetch_all()
>>> groups
['en_US']
>>> groups = match.fetch_all()
double free or corruption (out)
Aborted (core dumped)
I haven't looked into it, but my guess is that the returned MatchInfo
is referencing a string which has been freed.
"string is not copied and is used in GMatchInfo internally. If you use any GMatchInfo method (except g_match_info_free()) after freeing or modifying string then the behaviour is undefined."
(from https://docs.gtk.org/glib/method.Regex.match.html)
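If that guess is right, a possible stopgap (not a fix for the binding) is to avoid GLib.Regex for this and use Python's standard re module, whose match object keeps its own reference to the subject string. This is only a sketch: it assumes the pattern does not rely on GLib/PCRE-specific syntax, and the fetch_all helper below is a hypothetical stand-in I wrote to mimic GLib.MatchInfo.fetch_all(), not part of any library.

```python
import re

s = "hello"
pattern = re.compile(s)

# search() scans the whole string, which is closer to g_regex_match()
# than re.match(), which only matches at the start of the string.
match = pattern.search(s)
assert match is not None, "should match"


def fetch_all(m):
    """Hypothetical helper: whole match first, then each capture group.

    Unmatched groups are returned as empty strings.
    """
    return [m.group(0), *m.groups(default="")]


# The match object holds a reference to the subject string (match.string),
# so reading the groups repeatedly does not touch freed memory.
first = fetch_all(match)
second = fetch_all(match)
print(first)
print(second)
```

Calling the helper twice here is the same access pattern that crashes in the reproducer above.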
Versions:
(venv)$ pip list
Package    Version
---------- -------
pip        22.0.2
pycairo    1.25.1
PyGObject  3.46.0
setuptools 59.6.0
Xubuntu 22.04 system, with libglib2.0-0 version 2.72.4-0ubuntu2.2, running in a Python 3.10.12 venv with PyGObject freshly installed.