g_file_set_contents() can fill target with NUL bytes if it did not previously exist
Submitted by Will Thompson
Created attachment 364082 overengineered reproducer for this bug
Conventional wisdom has it that after a successful call to
g_file_set_contents(filename, "foo", 3, ...)
the file at 'filename' will contain either "foo" or its previous contents, even after an unclean unmount of the filesystem in question. In particular, I would expect that if 'filename' did not previously exist, it will either contain "foo" or not exist. Similarly, if 'filename' was previously empty, I would expect it to either contain "foo" or still be empty.
In both these cases, however, it can happen that 'filename' is the correct length (3 bytes in this case) but contains only NUL bytes. Simon Schampijer originally observed this symptom on Endless OS in a virtual machine. One of our daemons writes some keyfiles on first launch with g_key_file_save_to_file(), which uses g_file_set_contents(), and those files were the expected length but full of NULs. The filesystem in question is ext4 with not-too-unusual creation flags and default mount flags.
I can reproduce this artificially on a Fedora 27 system by mapping an ext4 image (created with
mkfs.ext4 and no flags) to a device with 'dmsetup', mounting it (with default flags), then using 'dmsetup remove --force' to remove it while it is still mounted. (Actually, it's replaced by a dm target which causes all reads or writes to fail; close enough.) Test script attached. It's a bit over-engineered but I happened to have most of the scaffolding lying around from another test suite, and I couldn't think of another way to force an unclean unmount without using real hardware.
The documentation does not quite spell out the "old or new" guarantee I expected, but it does use the word "atomic":
Writes all of contents to a file named filename , with good error checking. If a file called filename already exists it will be overwritten.
This write is atomic in the sense that it is first written to a temporary file which is then renamed to the final name. [...]
and I think it's reasonable to expect that the target file will not contain incorrect contents in this way. If not, I think this edge case should be called out in the documentation. It was relatively simple to handle the "file is full o' NULs" case in the Endless OS component where we saw this problem, but that might not be true in genral.
I think this behaviour is because fsync() is skipped when the target file is nonexistent or empty: https://git.gnome.org/browse/glib/tree/glib/gfileutils.c?id=6bcc8b40349055ace526ccfb25f58b6322c82d64#n1117
/* If the final destination exists and is > 0 bytes, we want to sync the * newly written file to ensure the data is on disk when we rename over * the destination. Otherwise if we get a system crash we can lose both * the new and the old file on some filesystems. (I.E. those that don't * guarantee the data is written to the disk before the metadata.) */ if (g_lstat (dest_file, &statbuf) == 0 && statbuf.st_size > 0 && fsync (fd) != 0)
The fsync() has always been skipped when the target file does not exist https://git.gnome.org/browse/glib/commit/?id=6cff88ba18b3bc0d118308f109840cb163dcea03 and the "non-zero size" check was added shortly after the fsync() was introduced to improve the performance of trashing files: https://git.gnome.org/browse/glib/commit/?id=d20a188b1250ab3cf211d684429127d99378e886.
The [posix_]fallocate() was added several years later: https://git.gnome.org/browse/glib/commit/glib/gfileutils.c?id=d3be43fcc5165b7680c9073438ad60a3652c1703 so I initially thought this might be a regression introduced by that. But no: if I patch it out, in the "file did not previously exist" case, an empty file is present at the target path after a crash. This means that the "existed-but-was-empty" case behaves as expected, but only by accident :-)
Attachment 364082, "overengineered reproducer for this bug":