Re: [RFC v4 linux-next 19/19] fs & block: remove bdev->bd_inode

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

在 2024/03/22 14:33, Al Viro 写道:
On Tue, Mar 19, 2024 at 04:26:19PM +0800, Yu Kuai wrote:

+void put_bdev_file(struct block_device *bdev)
+{
+       struct file *file = NULL;
+       struct inode *bd_inode = bdev_inode(bdev);
+
+       mutex_lock(&bdev->bd_disk->open_mutex);
+       file = bd_inode->i_private;
+
+       if (!atomic_read(&bdev->bd_openers))
+               bd_inode->i_private = NULL;
+
+       mutex_unlock(&bdev->bd_disk->open_mutex);
+
+       fput(file);
+}

Locking is completely wrong here.  The only thing that protects
->bd_openers is ->open_mutex.  atomic_read() is obviously a red
herring.

I'm lost here, in get_bdev_file() and put_bdev_file(), I grabbed
'open_mutex' to protect reading 'bd_openers', reading and setting
'bd_inode->i_private'.

Suppose another thread has already opened the same sucker
with bdev_file_open_by_dev().

Now you are doing the same thing, just as the other guy is
getting to bdev_release() call.

The thing is, between your get_bdev_file() and increment of ->bd_openers
(in bdev_open()) there's a window when bdev_release() of the old file
could've gotten all the way through the decrement of ->bd_openers
(to 0, since our increment has not happened yet) and through the
call of put_bdev_file(), which ends up clearing ->i_private.

End result:

* old ->i_private leaked (already grabbed by your get_bdev_file())
* ->bd_openers at 1 (after your bdev_open() gets through)
* ->i_private left NULL.

Yes, I got you now. The problem is this patch is that:

1) opener 1, set bdev_file, bd_openers is 1
2) opener 2, before bdev_open(), get bdev_file,
3) close 1, bd_openers is 0, clear bdev_file
4) opener 2, after bdev_open(), bdev_file is cleared unexpected.

Christoph, could we please get rid of that atomic_t nonsense?
It only confuses people into brainos like that.  It really
needs ->open_mutex for any kind of atomicity.

While we're here, which way should we move forward?
1. keep the behavior to use bdev for iomap/buffer_head for raw block
ops;
2. record new 'bdev_file' in 'bd_inode->i_private', and use a new way
to handle the concurrent scenario.
3. other possible solution?

Thanks,
Kuai


.






[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux