Re: [PATCH V2] fs: avoid softlockups in s_inodes iterators

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed 16-10-19 10:26:16, Eric Sandeen wrote:
> On 10/16/19 9:39 AM, Eric Sandeen wrote:
> > On 10/16/19 8:49 AM, Jan Kara wrote:
> >> On Wed 16-10-19 08:23:51, Eric Sandeen wrote:
> >>> On 10/16/19 4:42 AM, Jan Kara wrote:
> >>>> On Tue 15-10-19 21:36:08, Eric Sandeen wrote:
> >>>>> On 10/15/19 2:37 AM, Jan Kara wrote:
> >>>>>> On Mon 14-10-19 16:30:24, Eric Sandeen wrote:
> >>>>>>> Anything that walks all inodes on sb->s_inodes list without rescheduling
> >>>>>>> risks softlockups.
> >>>>>>>
> >>>>>>> Previous efforts were made in 2 functions, see:
> >>>>>>>
> >>>>>>> c27d82f fs/drop_caches.c: avoid softlockups in drop_pagecache_sb()
> >>>>>>> ac05fbb inode: don't softlockup when evicting inodes
> >>>>>>>
> >>>>>>> but there hasn't been an audit of all walkers, so do that now.  This
> >>>>>>> also consistently moves the cond_resched() calls to the bottom of each
> >>>>>>> loop in cases where it already exists.
> >>>>>>>
> >>>>>>> One loop remains: remove_dquot_ref(), because I'm not quite sure how
> >>>>>>> to deal with that one w/o taking the i_lock.
> >>>>>>>
> >>>>>>> Signed-off-by: Eric Sandeen <sandeen@xxxxxxxxxx>
> >>>>>>
> >>>>>> Thanks Eric. The patch looks good to me. You can add:
> >>>>>>
> >>>>>> Reviewed-by: Jan Kara <jack@xxxxxxx>
> >>>>>
> >>>>> thanks
> >>>>>
> >>>>>> BTW, I suppose you need to add Al to pickup the patch?
> >>>>>
> >>>>> Yeah (cc'd now)
> >>>>>
> >>>>> But it was just pointed out to me that if/when the majority of inodes
> >>>>> at umount time have i_count == 0, we'll never hit the resched in 
> >>>>> fsnotify_unmount_inodes() and may still have an issue ...
> >>>>
> >>>> Yeah, that's a good point. So that loop will need some further tweaking
> >>>> (like doing iget-iput dance in need_resched() case like in some other
> >>>> places).
> >>>
> >>> Well, it's already got an iget/iput for anything with i_count > 0.  But
> >>> as the comment says (and I think it's right...) doing an iget/iput
> >>> on i_count == 0 inodes at this point would be without SB_ACTIVE and the final
> >>> iput here would actually start evicting inodes in /this/ loop, right?
> >>
> >> Yes, it would but since this is just before calling evict_inodes(), I have
> >> currently hard time remembering why evicting inodes like that would be an
> >> issue.
> > 
> > Probably just weird to effectively evict all inodes prior to evict_inodes() ;)
> > 
> >>> I think we could (ab)use the lru list to construct a "dispose" list for
> >>> fsnotify processing as was done in evict_inodes...
> > 
> > [narrator: Eric's idea here is dumb and it won't work]
> > 
> >>> or maybe the two should be merged, and fsnotify watches could be handled
> >>> directly in evict_inodes.  But that doesn't feel quite right.
> >>
> >> Merging the two would be possible (and faster!) as well but I agree it
> >> feels a bit dirty :)
> > 
> > It's starting to look like maybe the only option...
> > 
> > I'll see if Al is willing to merge this patch as is for the simple "schedule
> > the big loops" and see about a 2nd patch on top to do more surgery for this
> > case.
> 
> Sorry for thinking out loud in public but I'm not too familiar with fsnotify, so
> I'm being timid.  However, since fsnotify_sb_delete() and evict_inodes() are working
> on orthogonal sets of inodes (fsnotify_sb_delete only cares about nonzero refcount,
> and evict_inodes only cares about zero refcount), I think we can just swap the order
> of the calls.  The fsnotify call will then have a much smaller list to walk
> (any refcounted inodes) as well.
> 
> I'll try to give this a test.

Yes, this should make the softlockup impossible to trigger in practice. So
agreed.

								Honza

> 
> diff --git a/fs/super.c b/fs/super.c
> index cfadab2cbf35..cd352530eca9 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -448,10 +448,12 @@ void generic_shutdown_super(struct super_block *sb)
>  		sync_filesystem(sb);
>  		sb->s_flags &= ~SB_ACTIVE;
>  
> -		fsnotify_sb_delete(sb);
>  		cgroup_writeback_umount();
>  
> +		/* evict all inodes with zero refcount */
>  		evict_inodes(sb);
> +		/* only nonzero refcount inodes can have marks */
> +		fsnotify_sb_delete(sb);
>  
>  		if (sb->s_dio_done_wq) {
>  			destroy_workqueue(sb->s_dio_done_wq);
> 
> 
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux