Re: [PATCH v2 04/34] md: port block device access to file

On Mon, Apr 15, 2024 at 04:53:42PM +0200, Christian Brauner wrote:
> On Mon, Apr 15, 2024 at 10:35:53PM +0800, Ming Lei wrote:
> > On Mon, Apr 15, 2024 at 02:35:17PM +0200, Christian Brauner wrote:
> > > On Mon, Apr 15, 2024 at 05:26:19PM +0800, Ming Lei wrote:
> > > > Hello,
> > > > 
> > > > On Tue, Jan 23, 2024 at 02:26:21PM +0100, Christian Brauner wrote:
> > > > > Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx>
> > > > > ---
> > > > >  drivers/md/dm.c               | 23 +++++++++++++----------
> > > > >  drivers/md/md.c               | 12 ++++++------
> > > > >  drivers/md/md.h               |  2 +-
> > > > >  include/linux/device-mapper.h |  2 +-
> > > > >  4 files changed, 21 insertions(+), 18 deletions(-)
> > > > > 
> > > > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> > > > > index 8dcabf84d866..87de5b5682ad 100644
> > > > > --- a/drivers/md/dm.c
> > > > > +++ b/drivers/md/dm.c
> > > > 
> > > > ...
> > > > 
> > > > > @@ -775,7 +778,7 @@ static void close_table_device(struct table_device *td, struct mapped_device *md
> > > > >  {
> > > > >  	if (md->disk->slave_dir)
> > > > >  		bd_unlink_disk_holder(td->dm_dev.bdev, md->disk);
> > > > > -	bdev_release(td->dm_dev.bdev_handle);
> > > > > +	fput(td->dm_dev.bdev_file);
> > > > 
> > > > The above change caused regression on 'dmsetup remove_all'.
> > > > 
> > > > blkdev_release() is delayed because of fput(), so dm_lock_for_deletion
> > > > returns -EBUSY, then this dm disk is skipped in remove_all().
> > > > 
> > > > Forcibly setting DMF_DEFERRED_REMOVE might solve it, but the device
> > > > mapper guys need to check whether that is safe.
> > > > 
> > > > Or is there a better solution?
> > > 
> > > Yeah, I think there is. You can just switch all fput() instances in
> > > device mapper to bdev_fput() which is mainline now. This will yield the
> > > device and make it able to be reclaimed. Should be as simple as the
> > > patch below. Could you test this and send a patch based on this (I'm on
> > > a prolonged vacation so I don't have time right now.):
> > 
> > Unfortunately it doesn't work.
> > 
> > Here the problem is that blkdev_release() is delayed, which changes
> > the behavior of 'dmsetup remove_all' so that some dm disks aren't
> > removed.
> > 
> > Please see dm_lock_for_deletion() and dm_blk_open()/dm_blk_close().
> 
> So you really need blkdev_release() itself to be synchronous? Groan, in

The current dm implementation does more or less rely on this, at least.
It could be addressed by forcibly setting DMF_DEFERRED_REMOVE in
remove_all().

> that case use __fput_sync() instead of fput() which ensures that this
> file is closed synchronously.

I tried __fput_sync(), but it triggers the following panic:

[  113.486522] ------------[ cut here ]------------
[  113.486524] kernel BUG at fs/file_table.c:453!
[  113.486531] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
[  113.488878] CPU: 6 PID: 1919 Comm: dmsetup Kdump: loaded Not tainted 5.14.0+ #23
[  113.490114] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-1.fc37 04/01/2014
[  113.491661] RIP: 0010:__fput_sync+0x25/0x30
[  113.492562] Code: 90 90 90 90 90 0f 1f 44 00 00 f0 48 ff 4f 38 75 14 65 48 8b 04 25 40 25 03 00 f6 40 36 20 74 0a e9 20 fd ff ff c3 cc cc cc cc <0f0
[  113.493926] RSP: 0018:ffffb76581003c20 EFLAGS: 00010246
[  113.494220] RAX: ffff92eca6ef8000 RBX: ffff92ed176c3c18 RCX: 000000008080007c
[  113.494632] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffff92ec844cac00
[  113.495033] RBP: ffff92ed176c3c00 R08: 0000000000000001 R09: 0000000000000000
[  113.495378] R10: ffffb76581003b00 R11: ffffb76581003b68 R12: ffff92ec8fccec20
[  113.495723] R13: ffff92ec8431b400 R14: ffff92ec8431b508 R15: ffff92ec8fccec00
[  113.496108] FS:  00007f5be5638840(0000) GS:ffff92f0ebb80000(0000) knlGS:0000000000000000
[  113.496581] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  113.496907] CR2: 00007f5be54694b0 CR3: 0000000108e54003 CR4: 0000000000770ef0
[  113.497308] PKRU: 55555554
[  113.497469] Call Trace:
[  113.497613]  <TASK>
[  113.497741]  ? show_trace_log_lvl+0x1c4/0x2df
[  113.497997]  ? show_trace_log_lvl+0x1c4/0x2df
[  113.498251]  ? dm_put_table_device+0x64/0xd0 [dm_mod]
[  113.498553]  ? __die_body.cold+0x8/0xd
[  113.498768]  ? die+0x2b/0x50
[  113.498937]  ? do_trap+0xce/0x120
[  113.499129]  ? __fput_sync+0x25/0x30
[  113.499337]  ? do_error_trap+0x65/0x80
[  113.499577]  ? __fput_sync+0x25/0x30
[  113.499787]  ? exc_invalid_op+0x4e/0x70
[  113.500011]  ? __fput_sync+0x25/0x30
[  113.500239]  ? asm_exc_invalid_op+0x16/0x20
[  113.500842]  ? __fput_sync+0x25/0x30
[  113.501387]  dm_put_table_device+0x64/0xd0 [dm_mod]
[  113.502047]  dm_put_device+0x80/0x110 [dm_mod]
[  113.502650]  stripe_dtr+0x2f/0x50 [dm_mod]
[  113.503218]  dm_table_destroy+0x59/0x120 [dm_mod]
[  113.503842]  __dm_destroy+0x114/0x1e0 [dm_mod]
[  113.504402]  dm_hash_remove_all+0x63/0x160 [dm_mod]
[  113.505028]  remove_all+0x1e/0x30 [dm_mod]
[  113.505602]  ctl_ioctl+0x19f/0x290 [dm_mod]
[  113.506146]  dm_ctl_ioctl+0xa/0x20 [dm_mod]
[  113.506717]  __x64_sys_ioctl+0x87/0xc0
[  113.507230]  do_syscall_64+0x5c/0xf0
[  113.507755]  ? exc_page_fault+0x62/0x150
[  113.508309]  entry_SYSCALL_64_after_hwframe+0x6e/0x76
[  113.508945] RIP: 0033:0x7f5be543ec6b



Thanks. 
Ming




