Hi,
On 2024/03/22 14:33, Al Viro wrote:
> On Tue, Mar 19, 2024 at 04:26:19PM +0800, Yu Kuai wrote:
>> +void put_bdev_file(struct block_device *bdev)
>> +{
>> + struct file *file = NULL;
>> + struct inode *bd_inode = bdev_inode(bdev);
>> +
>> + mutex_lock(&bdev->bd_disk->open_mutex);
>> + file = bd_inode->i_private;
>> +
>> + if (!atomic_read(&bdev->bd_openers))
>> + bd_inode->i_private = NULL;
>> +
>> + mutex_unlock(&bdev->bd_disk->open_mutex);
>> +
>> + fput(file);
>> +}
> Locking is completely wrong here. The only thing that protects
> ->bd_openers is ->open_mutex. atomic_read() is obviously a red
> herring.
I'm lost here: in get_bdev_file() and put_bdev_file(), I grab
'open_mutex' to protect reading 'bd_openers', and reading and setting
'bd_inode->i_private'.
> Suppose another thread has already opened the same sucker
> with bdev_file_open_by_dev().
> Now you are doing the same thing, just as the other guy is
> getting to bdev_release() call.
> The thing is, between your get_bdev_file() and increment of ->bd_openers
> (in bdev_open()) there's a window when bdev_release() of the old file
> could've gotten all the way through the decrement of ->bd_openers
> (to 0, since our increment has not happened yet) and through the
> call of put_bdev_file(), which ends up clearing ->i_private.
> End result:
> 	* old ->i_private leaked (already grabbed by your get_bdev_file())
> 	* ->bd_openers at 1 (after your bdev_open() gets through)
> 	* ->i_private left NULL.
Yes, I got you now. The problem with this patch is that:
1) opener 1 sets bdev_file, bd_openers is 1;
2) opener 2, before bdev_open(), gets bdev_file;
3) closer 1 drops bd_openers to 0 and clears bdev_file;
4) opener 2, after bdev_open(), finds bdev_file cleared unexpectedly.
> Christoph, could we please get rid of that atomic_t nonsense?
> It only confuses people into brainos like that. It really
> needs ->open_mutex for any kind of atomicity.
While we're here, which way should we move forward?
1. keep the current behaviour of using bdev for iomap/buffer_head for raw
block ops;
2. record the new 'bdev_file' in 'bd_inode->i_private', and use a new way
to handle the concurrent scenario;
3. some other solution?
Thanks,
Kuai