Re: [PATCH] vfs: Avoid softlockups in drop_pagecache_sb()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue 29-01-19 16:56:36, Andrew Morton wrote:
> On Mon, 14 Jan 2019 09:53:43 +0100 Jan Kara <jack@xxxxxxx> wrote:
> 
> > When superblock has lots of inodes without any pagecache (like is the
> > case for /proc), drop_pagecache_sb() will iterate through all of them
> > without dropping sb->s_inode_list_lock which can lead to softlockups
> > (one of our customers hit this).
> > 
> > Fix the problem by going to the slow path and doing cond_resched() in
> > case the process needs rescheduling.
> > 
> > ...
> >
> > --- a/fs/drop_caches.c
> > +++ b/fs/drop_caches.c
> > @@ -21,8 +21,13 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
> >  	spin_lock(&sb->s_inode_list_lock);
> >  	list_for_each_entry(inode, &sb->s_inodes, i_sb_list) {
> >  		spin_lock(&inode->i_lock);
> > +		/*
> > +		 * We must skip inodes in unusual state. We may also skip
> > +		 * inodes without pages but we deliberately won't in case
> > +		 * we need to reschedule to avoid softlockups.
> > +		 */
> >  		if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
> > -		    (inode->i_mapping->nrpages == 0)) {
> > +		    (inode->i_mapping->nrpages == 0 && !need_resched())) {
> >  			spin_unlock(&inode->i_lock);
> >  			continue;
> >  		}
> > @@ -30,6 +35,7 @@ static void drop_pagecache_sb(struct super_block *sb, void *unused)
> >  		spin_unlock(&inode->i_lock);
> >  		spin_unlock(&sb->s_inode_list_lock);
> >  
> > +		cond_resched();
> >  		invalidate_mapping_pages(inode->i_mapping, 0, -1);
> >  		iput(toput_inode);
> >  		toput_inode = inode;
> 
> Are we sure there's no situation in which a large number of inodes can
> be in the "unusual state", leading to the same issue?

No, we cannot be really sure that there aren't many such inodes (although
I'd be surprised if there were). But the problem with "unusual state"
inodes is that you cannot just __iget() them (well, you could but it's
breaking the rules and would lead to use-after-free issues ;) and so
there's no easy way to drop the spinlock and then continue the iteration
after cond_resched(). So overall it's too complex to deal with this until
someone actually hits it.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux