Zero out unused space when creating disk images
When creating disk images, unused space (i.e. not part of a partition, or part of a partition but not occupied by files) presumably is copied into the image as-is. This would be a direct consequence of the way the image format works – it is simply a byte-by-byte copy of the disk, with no concept of partitions or filesystems.
This can cause privacy issues when a disk which has previously held confidential data gets repurposed for something else and an image is created from that disk – the image will hold any data that has previously been on the disk and not overwritten. Sure, the initial security issue is not properly destroying confidential data, but imaging may exacerbate the consequences if the image is intended for redistribution.
Secondly, because the unused sectors hold somewhat random data, the image file will not compress very efficiently, and compressed image files will take up more space than they need to.
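To illustrate the compression point, here is a minimal sketch in Python. It does not touch a real disk: the "used" and "stale" regions are just random bytes standing in for file data and leftover data, and the sizes and names (`compressed_size`, `raw_image`, `clean_image`) are made up for the example.

```python
import os
import zlib

def compressed_size(data: bytes) -> int:
    """Return the size of data after zlib compression at the default-ish level 6."""
    return len(zlib.compress(data, 6))

SIZE = 1 << 20  # pretend the disk is 1 MiB

# Stand-in for sectors that hold actual file data (a quarter of the disk).
used = os.urandom(SIZE // 4)
# Stand-in for leftover data in sectors that are no longer in use.
stale = os.urandom(SIZE - len(used))
# The same unused region, zeroed out.
zeroed = bytes(len(stale))

raw_image = used + stale      # unused space copied as-is
clean_image = used + zeroed   # unused space replaced by zero bytes

print("as-is: ", compressed_size(raw_image), "bytes")
print("zeroed:", compressed_size(clean_image), "bytes")
```

Real leftover data is usually not perfectly random, so the difference on an actual disk will be smaller than in this worst case, but the direction is the same.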
An approach to improve this would be to plug into the imaging mechanism and zero out “unused” bytes, i.e. anything that is not part of a partition, or anything that is part of a file system but not holding any data (unused blocks, unused parts of blocks, possibly unused parts of filesystem data structures). This would prevent unintended data leaks and make compression more effective. Also, because unused bytes would no longer need to be read from disk, reading a disk for imaging could be faster: read time would then depend mainly on used disk space rather than on total disk capacity. (Note that the full image still needs to be written, so the performance of the storage location may reduce or eliminate any benefit from this.)
There are two potential obstacles to this:
- Altering data may be undesirable in some use cases (e.g. when creating a disk image for forensic purposes). There is probably no one-size-fits-all solution to this, but if the functionality is there, making this behavior configurable would be an option.
- Figuring out which bytes on the disk hold meaningful data and which ones do not requires the tool to understand partition tables and filesystems. If gnome-disk-utility simply runs dd or similar as a black box, this may be difficult. However, if gnome-disk-utility implements dd functionality by itself and has some partition/filesystem logic, it should be possible to replace certain parts of the data stream with zero bytes. The quickest win is probably to zero out disk space not allocated to any partition. Then filesystem-specific logic could be implemented, starting with the most common filesystems.
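The "quickest win" could look roughly like the following Python sketch. This is not how gnome-disk-utility works internally (I have not checked); the function names and the `keep_ranges` parameter are hypothetical. It assumes something else has already parsed the partition table and produced a list of byte ranges to preserve. Note that those ranges would have to include the partition table itself (e.g. the MBR sector, or the GPT header and entries plus the backup copy at the end of the disk), otherwise the resulting image would lose its partition layout.

```python
import io
from typing import BinaryIO, Iterable, Tuple

CHUNK = 1 << 20  # copy in 1 MiB pieces

def _write_zeros(dst: BinaryIO, count: int, chunk: int) -> None:
    """Write `count` zero bytes to dst without allocating them all at once."""
    zeros = bytes(min(chunk, count))
    while count > 0:
        n = min(len(zeros), count)
        dst.write(zeros[:n])
        count -= n

def image_with_zeroed_gaps(
    src: BinaryIO,
    dst: BinaryIO,
    total_size: int,
    keep_ranges: Iterable[Tuple[int, int]],
    chunk: int = CHUNK,
) -> None:
    """Copy src to dst, preserving only the given byte ranges.

    keep_ranges holds (start, end) byte offsets, end exclusive (assumed to
    come from a partition-table parser). Everything outside those ranges is
    written as zeros and is never read from src.
    """
    pos = 0  # how many bytes of the image have been written so far
    for start, end in sorted(keep_ranges):
        start = max(start, pos)      # tolerate overlapping ranges
        end = min(end, total_size)   # tolerate ranges past the end
        if start >= end:
            continue
        _write_zeros(dst, start - pos, chunk)  # gap before this range
        src.seek(start)
        remaining = end - start
        while remaining:
            buf = src.read(min(chunk, remaining))
            if not buf:
                raise EOFError("source ended before the expected size")
            dst.write(buf)
            remaining -= len(buf)
        pos = end
    _write_zeros(dst, total_size - pos, chunk)  # gap after the last range

# Tiny in-memory demonstration: a 100-byte "disk" with two kept ranges.
disk = io.BytesIO(b"\xAA" * 100)
image = io.BytesIO()
image_with_zeroed_gaps(disk, image, 100, [(10, 20), (50, 60)], chunk=7)
```

Because the gaps are skipped with a seek rather than read, this also shows where the potential read-time saving would come from. Filesystem-specific logic could later feed finer-grained ranges (only allocated blocks) into the same function. For the forensic use case mentioned above, passing a single range covering the whole disk gives a plain byte-by-byte copy.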