On Thu 14-03-24 15:47:52, Christian Brauner wrote: > On Thu, Mar 14, 2024 at 12:10:59PM +0100, Christian Brauner wrote: > > On Tue, Mar 12, 2024 at 07:32:35PM -0700, Christoph Hellwig wrote: > > > Now that this is in mainline it seems to cause blktests to crash > > > nbd/003 with a rather non-obvious oops for me: > > > > Ok, will be looking into that next. > > Ok, I know what's going on. Basically, fput() on the block device runs > asynchronously which means that bdev->bd_holder can still be set to @sb > after it has already been freed. Let me illustrate what I mean: > > P1 P2 > mount(sb) fd = open("/dev/nbd", ...) > -> file = bdev_file_open_by_dev(..., sb, ...) > bdev->bd_holder = sb; > > // Later on: > > umount(sb) > ->kill_block_super(sb) > |----> [fput() -> deferred via task work] > -> put_super(sb) -> frees the sb via rcu > | > | nbd_ioctl(NBD_CLEAR_SOCK) > | -> disk_force_media_change() > | -> bdev_mark_dead() > | -> fs_mark_dead() > | // Finds bdev->bd_holder == sb > |-> file->release::bdev_release() // Tries to get reference to it but it's freed; frees it again > bdev->bd_holder = NULL; > > Two solutions that come to my mind: > > [1] Indicate to fput() that this is an internal block devices open and > thus just close it synchronously. This is fine. First, because the block > device superblock is never unmounted or anything so there's no risk > from hanging there for any reason. Second, bdev_release() also ran > synchronously so any issue that we'd see from calling > file->f_op->release::bdev_release() we would have seen from > bdev_release() itself. See [1.1] for a patch. > > (2) Take a temporary reference to the holder when opening the block > device. This is also fine afaict because we differentiate between > passive and active references on superblocks and what we already do > in fs_bdev_mark_dead() and friends. This mean we don't have to mess > around with fput(). See [1.2] for a patch. > > (3) Revert and cry. No patch. > > Personally, I think (2) is more desirable because we don't lose the > async property of fput() and we don't need to have a new FMODE_* flag. > I'd like to do some more testing with this. Thoughts? Yeah, 2) looks like the least painful solution to me as well. Honza -- Jan Kara <jack@xxxxxxxx> SUSE Labs, CR