On Mon, Mar 03, 2025 at 10:03:42PM +0100, Mikulas Patocka wrote:
> 
> 
> On Mon, 3 Mar 2025, Christoph Hellwig wrote:
> 
> > On Mon, Mar 03, 2025 at 05:16:48PM +0100, Mikulas Patocka wrote:
> > > What should I use instead of bmap? Is fiemap exported for use in the
> > > kernel?
> > 
> > You can't do an ahead of time mapping. It's a broken concept.
> 
> Swapfile does ahead of time mapping. And I just looked at what swapfile
> does and copied the logic into dm-loop. If swapfile is not broken, how
> could dm-loop be broken?

Swap files cannot be accessed/modified by user code once the
swapfile is activated. See all the IS_SWAPFILE() checks throughout
the VFS and filesystem code.

Swap files must be fully allocated (i.e. not sparse) and must not
contain shared extents. This is required so that writes to the
swapfile never need block allocation, which would change the
mapping. Hence we explicitly prevent modification of the underlying
file mapping once the file is owned and mapped by the kernel as a
swapfile.

That's not how loop devices/image files work - we actually rely on
them being:

a) sparse; and
b) mutable, i.e. the mapping can be changed via direct access to
   the loop file whilst there is an active mounted filesystem on
   that loop file.

and so every IO needs to be mapped through the filesystem at
submission time.

The reason for a) is obvious: we don't need to allocate space for
the filesystem up front, so it is effectively thin provisioned.
Also, fstrim on the mounted loop device can punch out unused space
in the mounted filesystem.

The reason for b) is less obvious: snapshots via file cloning and
deduplication via extent sharing. The clone operation is an atomic
modification of the underlying file mapping, which then triggers
COW on future writes to those mappings, which causes the mapping to
change at write IO time.

IOWs, the whole concept that there is a "static mapping" for a loop
device image file for the life of the image file is fundamentally
flawed.

> > > Dm-loop is significantly faster than the regular loop:
> > >
> > > # modprobe brd rd_size=1048576
> > > # dd if=/dev/zero of=/dev/ram0 bs=1048576
> > > # mkfs.ext4 /dev/ram0
> > > # mount -t ext4 /dev/ram0 /mnt/test
> > > # dd if=/dev/zero of=/mnt/test/test bs=1048576 count=512

Urk. Ram disks are terrible for IO benchmarking. The IO is
synchronous (i.e. always completes in the submitter context) and
performance is -always CPU bound- due to the requirement for all
data copying to be marshalled through the CPU.

Please benchmark performance on NVMe SSDs - it will give a much
more accurate demonstration of the performance differences we'll
see in real world usage of loop device functionality...
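As a rough sketch of that kind of comparison - assuming an
otherwise unused NVMe namespace at /dev/nvme0n1, a /mnt/test mount
point, an illustrative 16G backing file, and that losetup hands
back /dev/loop0; substitute whatever devices and paths you actually
have - something like:

# mkfs.ext4 /dev/nvme0n1
# mount -t ext4 /dev/nvme0n1 /mnt/test
# truncate -s 16G /mnt/test/backing.img
# losetup --direct-io=on -f --show /mnt/test/backing.img
/dev/loop0
# dd if=/dev/zero of=/dev/loop0 bs=1048576 count=512 oflag=direct

and then the same dd repeated with the dm-loop target set up over
the same backing file, so both paths are hitting the same flash
rather than a CPU-bound ram disk.

-Dave.

-- 
Dave Chinner
david@xxxxxxxxxxxxx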