Subclassing GExiv2.Metadata in Python eats exceptions from get_tag_multiple(), causing CPython to crash
In the Variety program, we use GExiv2 to keep track of metadata when we download pictures from remote sources. In https://github.com/varietywalls/variety/issues/152, there have been multiple reports of "free(): invalid pointer" crashes, and Python's faulthandler seems to point to a get_tag_multiple() call as the reason.
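For anyone who wants to collect the same kind of trace, here is a minimal sketch of enabling faulthandler from inside a script. The helper name enable_crash_traces and its log_path parameter are made up for illustration; only the faulthandler calls come from the standard library. Since "free(): invalid pointer" ends in abort(), the interpreter receives SIGABRT, which is one of the signals faulthandler handles.

```python
import faulthandler
import sys


def enable_crash_traces(log_path=None):
    """Turn on faulthandler and return the stream it writes to.

    log_path is a hypothetical convenience parameter: when given, the
    traceback goes to that file instead of stderr.
    """
    stream = open(log_path, "w") if log_path else sys.stderr
    # Dump the Python traceback of all threads on SIGSEGV, SIGFPE,
    # SIGABRT, SIGBUS and SIGILL.
    faulthandler.enable(file=stream, all_threads=True)
    # The stream must stay open for as long as faulthandler is enabled,
    # so hand it back to the caller instead of closing it here.
    return stream
```

Calling enable_crash_traces() near the top of the test script should be enough; running the unmodified script with "python3 -X faulthandler" has the same effect.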
In particular, we use a subclass of GExiv2.Metadata that exposes key-value indexing of metadata fields: https://github.com/varietywalls/variety/blob/master/variety/Util.py#L152-L188. At first glance, I don't see anything strange in that bit of Python code (and it seems subclassing is generally supported by PyGObject).
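To make the pattern concrete, here is a rough sketch of what "key-value indexing on top of a metadata class" looks like. This is not the actual Variety code (see the link above for that). TagIndexingMixin, StubMetadata, IndexedStub and MULTIPLE_TAGS are names invented for this sketch, and StubMetadata is a pure-Python stand-in that only mimics the GExiv2.Metadata method names has_tag, get_tag_string, get_tag_multiple, set_tag_string and set_tag_multiple, so that the example runs without GExiv2 installed.

```python
class TagIndexingMixin:
    """Adds obj[key] access on top of GExiv2-style tag methods."""

    # Hypothetical set of tags that hold a list of values.
    MULTIPLE_TAGS = frozenset({"Iptc.Application2.Keywords"})

    def __getitem__(self, key):
        if not self.has_tag(key):
            raise KeyError(key)
        if key in self.MULTIPLE_TAGS:
            return self.get_tag_multiple(key)
        return self.get_tag_string(key)

    def __setitem__(self, key, value):
        if key in self.MULTIPLE_TAGS:
            self.set_tag_multiple(key, list(value))
        else:
            self.set_tag_string(key, str(value))


class StubMetadata:
    """Stand-in for GExiv2.Metadata; stores tags in a plain dict."""

    def __init__(self):
        self._tags = {}

    def has_tag(self, key):
        return key in self._tags

    def get_tag_string(self, key):
        return self._tags[key]

    def get_tag_multiple(self, key):
        return list(self._tags[key])

    def set_tag_string(self, key, value):
        self._tags[key] = value

    def set_tag_multiple(self, key, values):
        self._tags[key] = list(values)


class IndexedStub(TagIndexingMixin, StubMetadata):
    """In the real program the base class would be GExiv2.Metadata."""
```

The point of the sketch is that the subclass only adds Python-level methods and forwards to the inherited tag accessors, which is why I would not expect it to change how errors from get_tag_multiple() are handled.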
I've written some test cases using images provided by the reporters: 02263_flash_1920x1080.jpg.zip, 200717040711-1867.jpg. The result: when the image is read using GExiv2.Metadata, a Python exception is raised. Reading it with the VarietyMetadata subclass, however, crashes the Python interpreter!
On my system, I am using exiv2 0.25-4 and gexiv2 0.26-1, both from the Debian testing repositories.
I used this test case for VarietyMetadata, called vrty-crash.py:
#!/usr/bin/env python3

import sys
from variety.Util import VarietyMetadata

try:
    path = sys.argv[1]
except IndexError:
    print('Error: need 1 argument: path to file', file=sys.stderr)
    sys.exit(1)

x = VarietyMetadata(path)
print(x)
print(x.get_tag_multiple('Iptc.Application2.Keywords'))

and I get the following results:
$ python3 vrty-crash.py 02263_flash_1920x1080.jpg
<Util.VarietyMetadata object at 0x7f14cb76c120 (variety+Util+VarietyMetadata at 0xe1da20)>
free(): invalid pointer
Aborted
Using plain GExiv2.Metadata in an otherwise identical script (vrty-crash2.py), I get a proper exception instead:
#!/usr/bin/env python3

import sys
from gi.repository import GExiv2

try:
    path = sys.argv[1]
except IndexError:
    print('Error: need 1 argument: path to file', file=sys.stderr)
    sys.exit(1)

x = GExiv2.Metadata(path)
print(x)
print(x.get_tag_multiple('Iptc.Application2.Keywords'))

$ python3 vrty-crash2.py 02263_flash_1920x1080.jpg
vrty-crash2.py:4: PyGIWarning: GExiv2 was imported without specifying a version first. Use gi.require_version('GExiv2', '0.10') before import to ensure that the right version gets loaded.
  from gi.repository import GExiv2
<GExiv2.Metadata object at 0x7fa548915ea0 (GExiv2Metadata at 0x1a3ba80)>
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9a in position 1: invalid start byte
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "vrty-crash2.py", line 14, in <module>
    print(x.get_tag_multiple('Iptc.Application2.Keywords'))
SystemError: <class 'gobject.Warning'> returned a result with an error set
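The underlying UnicodeDecodeError can be reproduced in isolation, without GExiv2. In the sketch below, the byte string b"A\x9a" is a made-up sample chosen only because it has 0x9a at position 1; I have not extracted the actual keyword bytes from the image. The helper name utf8_failure is likewise just for illustration.

```python
def utf8_failure(raw):
    """Return (position, reason) if raw is not valid UTF-8, else None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as exc:
        return exc.start, exc.reason
    return None


# Hypothetical sample: 0x9a is a UTF-8 continuation byte, so it cannot
# start a character, which is exactly what the error message complains about.
sample = b"A\x9a"
failure = utf8_failure(sample)

# Decoding leniently substitutes U+FFFD for the undecodable byte
# instead of raising.
lenient = sample.decode("utf-8", errors="replace")
```

This suggests the keyword stored in the file is simply not UTF-8 encoded, so an exception from get_tag_multiple() is the expected outcome; the bug is that the subclass case ends in a crash instead.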