Generating bmap information from images
Nov 24, 2015
When dealing with disk images, the bmap-tool project is a godsend for writing said images to a physical device. Relying on an accompanying bmap file, bmaptool skips unused data when copying an image.
Sadly, only a few image providers bundle their images with the necessary bmap information. Even though using bmaptool with plain images still pays off (writing to devices is notably faster than a regular dd), most of its value is lost.
bmaptool create
Although bmaptool features a create option, that mode of operation is mostly useless as it relies on the file sparseness to extract the unused blocks. However, sparseness has completely different semantics! Quoting Wikipedia:
When reading sparse files, the file system transparently converts metadata representing empty blocks into “real” blocks filled with zero bytes at runtime.
When creating a bmap file based on image sparseness and subsequently writing that image to a device, the previously sparse blocks will be skipped. Reading those blocks from the device then returns whatever data was there before, which is not necessarily the zero bytes the image promised.
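To make the difference concrete, here is a minimal Python sketch. It is not part of bmaptool, and data_extents is a hypothetical helper; it assumes Linux and a file system that reports holes through SEEK_DATA and SEEK_HOLE. It lists the byte ranges the file system considers data; everything outside those ranges is a hole that still reads back as zeros.

```python
import os


def data_extents(path):
    """Return (start, end) byte ranges that the file system reports as data.

    Gaps between the returned ranges are holes: a read() there yields
    zeros, although nothing is stored on disk for them.
    """
    extents = []
    size = os.path.getsize(path)
    fd = os.open(path, os.O_RDONLY)
    try:
        offset = 0
        while offset < size:
            try:
                # Jump to the next byte that is backed by real data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError:
                # ENXIO: only a hole remains between offset and end of file.
                break
            # Find where this data run ends (end of file counts as a hole).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

Calling data_extents("something.img") on a sparse image returns only the stored ranges. On a file system without hole support, the whole file typically comes back as a single extent.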
bmaptool-scan
I’ve written a small script which scans for the actual unused blocks, i.e., based on the underlying file system instead of the image sparseness. These blocks are truly unused, which means we can safely skip writing them when copying an image to a device.
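Conceptually, the blocks that must be written are the complement of the free blocks. The following Python sketch illustrates that step under a few assumptions: free blocks arrive as inclusive (first, last) ranges already expressed in image block numbers, and mapped_ranges is a hypothetical name. It is not taken from the script itself.

```python
def mapped_ranges(total_blocks, free_ranges):
    """Return inclusive (first, last) block ranges that are NOT free.

    total_blocks -- number of blocks in the image
    free_ranges  -- iterable of inclusive (first, last) free block ranges
    """
    mapped = []
    cursor = 0  # first block not yet classified
    for first, last in sorted(free_ranges):
        if first > cursor:
            # Everything between the cursor and this free range is in use.
            mapped.append((cursor, first - 1))
        # Skip past the free range (max() copes with overlapping input).
        cursor = max(cursor, last + 1)
    if cursor < total_blocks:
        # Trailing blocks after the last free range are in use as well.
        mapped.append((cursor, total_blocks - 1))
    return mapped
```

For a real image, the free ranges of each partition would first have to be shifted by the partition's offset into the image, and anything outside a recognized file system (partition table, gaps) treated as in use.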
Usage
# modprobe loop
# bmaptool-scan --bmap something.img | tee something.bmap
Found 1 partition(s) in image
- processing 1.7G partition at 1.0M into the image
<?xml version="1.0"?>
<bmap version="2.0">
...
</bmap>
There are two program modes available: --bmap and --sparse. The former generates a bmap file, as can be seen above. The --sparse option punches holes in the input image. The resulting sparse image can subsequently be used to generate a bmap file using bmaptool create, or just to optimize storage of the image.
Note that you need to run the script as root, and that it depends on quite a few Perl packages. See the README for more details.
Limitations
As the script relies on parsing the image partition table and probing any partition found using file system specific tools, there are a few limitations:
- Implementation-wise: only ext4 is currently supported, though extending this to the entire ext family should be trivial. Adding support for other file systems means adding code to dump and parse the unallocated blocks within the partition.
- Nonstandard partition layouts (hybrid images, overlapping partitions, …): these might break the script, or might even be unsafe to write selectively. Use at your own risk!
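As an illustration of what "dump and parse the unallocated blocks" can look like for the ext family, here is a small Python sketch that extracts free block ranges from text in the style of dumpe2fs output. This is an assumption about one possible approach, not a description of how the script does it; parse_free_blocks is a hypothetical helper and the sample lines in the usage note are made up.

```python
import re


def parse_free_blocks(dump_text):
    """Extract inclusive (first, last) free block ranges from a text dump.

    Only the indented per-group "Free blocks:" lines are considered; the
    unindented summary line of the same name carries a count, not a range.
    """
    ranges = []
    for line in dump_text.splitlines():
        match = re.match(r"\s+Free blocks:\s*(.*)$", line)
        if not match:
            continue
        for item in match.group(1).split(","):
            item = item.strip()
            if not item:
                continue  # group without free blocks
            # "10-19" is a range, a bare "42" is a single block.
            first, _, last = item.partition("-")
            ranges.append((int(first), int(last or first)))
    return ranges
```

Fed a group line such as "  Free blocks: 10-19, 42, 90-99", the function yields the three ranges as pairs of block numbers relative to the start of the partition.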