Re: [PATCH 10/11] fs: fix hungtask due to repeated traversal of inodes list

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed 04-12-24 12:17:49, Christian Brauner wrote:
> On Mon, Nov 18, 2024 at 07:45:07PM +0800, Ye Bin wrote:
> > From: Ye Bin <yebin10@xxxxxxxxxx>
> > 
> > There's a issue when remove scsi disk, the invalidate_inodes() function
> > cannot exit for a long time, then trigger hungtask:
> > INFO: task kworker/56:0:1391396 blocked for more than 122 seconds.
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Workqueue: events_freezable virtscsi_handle_event [virtio_scsi]
> > Call Trace:
> >  __schedule+0x33c/0x7f0
> >  schedule+0x46/0xb0
> >  schedule_preempt_disabled+0xa/0x10
> >  __mutex_lock.constprop.0+0x22b/0x490
> >  mutex_lock+0x52/0x70
> >  scsi_scan_target+0x6d/0xf0
> >  virtscsi_handle_event+0x152/0x1a0 [virtio_scsi]
> >  process_one_work+0x1b2/0x350
> >  worker_thread+0x49/0x310
> >  kthread+0xfb/0x140
> >  ret_from_fork+0x1f/0x30
> > 
> > PID: 540499  TASK: ffff9b15e504c080  CPU: 44  COMMAND: "kworker/44:0"
> > Call trace:
> >  invalidate_inodes at ffffffff8f3b4784
> >  __invalidate_device at ffffffff8f3dfea3
> >  invalidate_partition at ffffffff8f526b49
> >  del_gendisk at ffffffff8f5280fb
> >  sd_remove at ffffffffc0186455 [sd_mod]
> >  __device_release_driver at ffffffff8f738ab2
> >  device_release_driver at ffffffff8f738bc4
> >  bus_remove_device at ffffffff8f737f66
> >  device_del at ffffffff8f73341b
> >  __scsi_remove_device at ffffffff8f780340
> >  scsi_remove_device at ffffffff8f7803a2
> >  virtscsi_handle_event at ffffffffc017204f [virtio_scsi]
> >  process_one_work at ffffffff8f1041f2
> >  worker_thread at ffffffff8f104789
> >  kthread at ffffffff8f109abb
> >  ret_from_fork at ffffffff8f001d6f
> > 
> > As commit 04646aebd30b ("fs: avoid softlockups in s_inodes iterators")
> > introduces the retry logic. In the problem environment, the 'i_count'
> > of millions of files is not zero. As a result, the time slice for each
> > traversal to the matching inode process is almost used up, and then the
> > traversal is started from scratch. The worst-case scenario is that only
> > one inode can be processed after each wakeup. Because this process holds
> > a lock, other processes will be stuck for a long time, causing a series
> > of problems.
> > To solve the problem of repeated traversal from the beginning, each time
> > the CPU needs to be freed, a cursor is inserted into the linked list, and
> > the traversal continues from the cursor next time.
> > 
> > Fixes: 04646aebd30b ("fs: avoid softlockups in s_inodes iterators")
> > Signed-off-by: Ye Bin <yebin10@xxxxxxxxxx>
> > ---
> >  fs/inode.c | 22 ++++++++++++++++++++--
> >  1 file changed, 20 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/inode.c b/fs/inode.c
> > index dc966990bda6..b78895af8779 100644
> > --- a/fs/inode.c
> > +++ b/fs/inode.c
> > @@ -857,11 +857,16 @@ static void dispose_list(struct list_head *head)
> >  void evict_inodes(struct super_block *sb)
> >  {
> >  	struct inode *inode, *next;
> > +	struct inode cursor;
> 
> It seems pretty adventurous to me to just add in a random inode whose
> only fiels that is initialized is i_state. That would need a proper
> analysis and argument that this is safe to do and won't cause trouble
> for any filesystem.
> 
> Jan, do you have thoughts on this?

Yeah, I think in the current state where there are several instances of
hand-crafted inode iteration code it is somewhat fragile to use the cursor
approach. I was staying silent because I was hoping Dave Chinner's patches
to clean up inode iteration get to a more ready state. Then either we have
well consolidated inode iteration code so additions like this can be easily
verified for correctness or we could even get as far as removing
sb->s_inodes list altogether as Dave outlined [1] which would nicely deal
with the issue solved here as well. Sadly that patch series seems to have
lost traction. Hopefully we can either revive it or at least scavenge the
nice preparatory cleanups...

								Honza

[1] https://lore.kernel.org/all/ZwRvshM65rxXTwxd@xxxxxxxxxxxxxxxxxxx
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux