On Wed, Oct 25, 2023 at 07:20:57PM +0200, Jan Kara wrote: > Hello! > > On Tue 24-10-23 16:53:38, Christian Brauner wrote: > > This is a mechanism that allows the holder of a block device to yield > > device access before actually closing the block device. > > > > If a someone yields a device then any concurrent opener claiming the > > device exclusively with the same blk_holder_ops as the current owner can > > wait for the device to be given up. Filesystems by default use > > fs_holder_ps and so can wait on each other. > > > > This mechanism allows us to simplify superblock handling quite a bit at > > the expense of requiring filesystems to yield devices. A filesytems must > > yield devices under s_umount. This allows costly work to be done outside > > of s_umount. > > > > There's nothing wrong with the way we currently do things but this does > > allow us to simplify things and kills a whole class of theoretical UAF > > when walking the superblock list. > > I'm not sure why is it better to create new ->yield callback called under > sb->s_umount rather than just move blkdev_put() calls back into > ->put_super? Or at least yielding could be done in ->put_super instead of The main reason was to not call potentially expensive blkdev_put()/bdev_release() under s_umount. If we don't care about this though then this shouldn't be a problem. And yes, then we need to move blkdev_put()/bdev_release() under s_umount including the main block device. IOW, we need to ensure that all bdev calls are done under s_umount before we remove the superblock from the instance list. I think that should be fine but I wanted to propose an alternative to that as well: cheap mark-for-release under s_umount and heavy-duty without s_umount. But I guess it doesn't matter because most filesystems did use to close devices under s_umount before anyway. Let me know what you think makes the most sense.