Test failures were previously ignored on macOS because there are 12 tests which consistently fail (and have not yet been fixed, because there are no regularly active macOS maintainers for GLib; you could help here!).
However, this means that new test failures can’t be spotted.
So, explicitly mark those 12 tests as should_fail on macOS, and then make other test failures cause failure of the CI run.
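
For illustration, a minimal sketch of the pattern, assuming a plain Meson test() declaration. The project name, test name and source file are hypothetical placeholders, not the actual GLib test definitions, which may be structured differently:

```meson
# Hypothetical sketch: 'example-test' and example_test.c are placeholders.
project('should-fail-sketch', 'c')

# True when the build targets macOS.
is_macos = host_machine.system() == 'darwin'

exe = executable('example-test', 'example_test.c')

# On macOS a failure of this test is reported as an expected failure,
# while an unexpected pass is flagged. On other platforms the test must pass.
test('example-test', exe, should_fail: is_macos)
```

With the known failures marked this way, the test run itself only fails on failures which are not already expected, so the CI job no longer needs to ignore its result.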

Signed-off-by: Philip Withnall <email@example.com>