Re: [PATCH RFC 0/6] fs,block: yield devices

Christian Brauner <brauner@xxxxxxxxxx> · Wed, 25 Oct 2023 22:46:29 +0200

On Wed, Oct 25, 2023 at 07:20:57PM +0200, Jan Kara wrote:
> Hello!
> 
> On Tue 24-10-23 16:53:38, Christian Brauner wrote:
> > This is a mechanism that allows the holder of a block device to yield
> > device access before actually closing the block device.
> > 
> > If a someone yields a device then any concurrent opener claiming the
> > device exclusively with the same blk_holder_ops as the current owner can
> > wait for the device to be given up. Filesystems by default use
> > fs_holder_ps and so can wait on each other.
> > 
> > This mechanism allows us to simplify superblock handling quite a bit at
> > the expense of requiring filesystems to yield devices. A filesytems must
> > yield devices under s_umount. This allows costly work to be done outside
> > of s_umount.
> > 
> > There's nothing wrong with the way we currently do things but this does
> > allow us to simplify things and kills a whole class of theoretical UAF
> > when walking the superblock list.
> 
> I'm not sure why is it better to create new ->yield callback called under
> sb->s_umount rather than just move blkdev_put() calls back into
> ->put_super? Or at least yielding could be done in ->put_super instead of

The main reason was to not call potentially expensive
blkdev_put()/bdev_release() under s_umount. If we don't care about this
though then this shouldn't be a problem. And yes, then we need to move
blkdev_put()/bdev_release() under s_umount including the main block
device. IOW, we need to ensure that all bdev calls are done under
s_umount before we remove the superblock from the instance list. I think
that should be fine but I wanted to propose an alternative to that as
well: cheap mark-for-release under s_umount and heavy-duty without
s_umount. But I guess it doesn't matter because most filesystems did use
to close devices under s_umount before anyway. Let me know what you
think makes the most sense.