2.79 regression: gdatetime test failing on 64-bit big-endian since #3119
I've started looking at packaging GLib 2.79.x in Debian experimental, so that we can make the packaging changes that are necessary for the GIRepository restructuring before it becomes time-critical to do so.
One issue I've encountered is a test failure in gdatetime (glib/tests/gdatetime.c) which seems to affect big-endian 64-bit architectures (s390x, together with the semi-official ppc64 and sparc64 ports).
Little-endian architectures seem to be unaffected, as are big-endian 32-bit architectures (the semi-official hppa and powerpc ports passed this test).
git bisect says df4aea76 "gdatetime: Add support for %E modifier to g_date_time_format()" (from #3119) is the first bad commit.
Steps to reproduce
All packages except GLib itself are from Debian unstable (rolling release), on various Linux architectures, with glibc (currently version 2.37). The build logs linked below have full lists of the packages installed in the build chroot.
I built GLib with options similar to those we use for 2.78.x, and ran build-time tests as usual. The build log has the full invocation: please look for the meson setup that is run in debian/build/deb, and ignore the one for debian/build/udeb (which is a cut-down build used in the installer).
It is relevant for this particular test that we generate lots of locales with https://salsa.debian.org/gnome-team/glib/-/blob/debian/latest/debian/tests/run-with-locales?ref_type=heads, in particular ja_JP.utf8 and ja_JP.EUC-JP (search for run-with-locales in the build log for the full list), so any tests that run conditionally if a certain locale is present will be run.
After reproducing the test failure on a developer-accessible machine, I tried repeating the test with meson test -C debian/build/deb gdatetime, and also tried debugging it by copying the test command from the Meson log, and inserting gdb before the executable name.
Expected result
Tests pass.
Actual result (1): test failure
In some test runs I get a test failure:
▶ 21/369 /GDateTime/eras/japan - GLib:ERROR:../../../glib/tests/gdatetime.c:2297:test_date_time_eras_japan: assertion failed (p_casefold == (o_casefold)): ("201904\346\234\21030\346\227\245 00\346\231\20200\345\210\20600\347\247\222" == "\345\271\263\346\210\22031\345\271\26404\346\234\21030\346\227\245 00\346\231\20200\345\210\20600\347\247\222") FAIL
which I believe decodes to:
- p_casefold = 201904月30日 00時00分00秒
- o_casefold = 平成31年04月30日 00時00分00秒
For example, this happened on the unofficial sparc64 port: https://buildd.debian.org/status/fetch.php?pkg=glib2.0&arch=sparc64&ver=2.79.0%2Bgit20240110%7Eg38f5ba3c-1&stamp=1705086951&raw=0
When I re-run the test interactively on s390x, this happens much more frequently than the other failure mode. I don't know why we saw the other failure mode in non-interactive official builds.
In particular, when I re-run the test under gdb, this is what I always(?) get.
Actual result (2): segfault or bus error
In other test runs I saw a segfault (SIGSEGV), for example on the official s390x port (twice):
- https://buildd.debian.org/status/fetch.php?pkg=glib2.0&arch=s390x&ver=2.79.0%2Bgit20240110%7Eg38f5ba3c-1&stamp=1705083261&raw=0
- https://buildd.debian.org/status/fetch.php?pkg=glib2.0&arch=s390x&ver=2.79.0%2Bgit20240110%7Eg38f5ba3c-1&stamp=1705088035&raw=0
>>> UBSAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1:print_stacktrace=1 G_TEST_SRCDIR='/<<PKGBUILDDIR>>/glib/tests' MALLOC_PERTURB_=151 ASAN_OPTIONS=halt_on_error=1:abort_on_error=1:print_summary=1 G_ENABLE_DIAGNOSTIC=1 LD_LIBRARY_PATH='/<<PKGBUILDDIR>>/debian/build/deb/glib' G_DEBUG=gc-friendly MALLOC_CHECK_=2 G_TEST_BUILDDIR='/<<PKGBUILDDIR>>/debian/build/deb/glib/tests' '/<<PKGBUILDDIR>>/debian/build/deb/glib/tests/gdatetime'
▶ 21/369 /GDateTime/invalid OK
▶ 21/369 /GDateTime/add_days OK
▶ 21/369 /GDateTime/add_full OK
[... more successful tests]
▶ 21/369 /GDateTime/format_iso8601 OK
▶ 21/369 /GDateTime/strftime OK
21/369 glib:glib+core+slow / gdatetime ERROR 0.13s killed by signal 11 SIGSEGV
――――――――――――――――――――――――――――――――――――― ✀ ―――――――――――――――――――――――――――――――――――――
stderr:
(test program exited with status code -11)
On the unofficial ppc64 port I got a bus error (SIGBUS) in about the same place.
I have been able to reproduce this on a developer-accessible s390x machine, but only rarely, and not interactively: when I try to run under gdb, for whatever reason, I always get the assertion failure instead.