On Thu 04-09-14 11:44:59, Ted Tso wrote:
> On Thu, Sep 04, 2014 at 09:15:53AM +0200, Jan Kara wrote:
> > Ah, sorry. I was mistaken and thought we do check for __GFP_FS in
> > ext4_es_scan() but we don't and we don't need to. But thinking about it
> > again - if we're going to always scan at most nr_to_scan cache entries,
> > there's probably no need to reduce s_es_lock latency by playing with
> > spinlock_contended(), right?
>
> I'm more generally worried about contention on s_es_lock, since it's a
> file-system-wide spinlock that is grabbed whenever we need to add or
> remove an inode from the es_list. So if someone were to try to run the
> AIM7 benchmark on a large core-count machine on an ext4 file system
> mounted on a ramdisk, this lock would likely show up.
>
> Now, this might not be a realistic scenario, but it's a common way to
> test for fs scalability without having a super-expensive RAID array,
> so it's quite common if you look at FAST papers over the last couple
> of years, for example...
>
> So my thinking was that if we do run into contention, the shrinker
> thread should always yield, since if it gets slowed down slightly,
> there's no harm done. Hmmm.... OTOH, the extra cache line bounce
> could potentially be worse, so maybe it would be better to let the
> shrinker thread do its thing and then get out of there.

Yeah. I think the cache line bouncing limits scalability in much the same
way the spinlock itself does, so there's no big win in shortening the lock
hold times.

If someone is concerned about scalability of our extent cache LRU, we
could use a fancier LRU implementation like the one in mm/list_lru.c that
is already used for other fs objects. But I would see that as a separate
step, and only once someone can show a benefit...

								Honza
-- 
Jan Kara <jack@xxxxxxx>
SUSE Labs, CR
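
For reference, a minimal sketch of the contention-based bailout being
debated above. The names (my_cache, my_entry, my_shrink_scan) are made up
for illustration; this is not the real ext4 extent status tree code or
ext4_es_scan(), just the shape of the idea:

/*
 * Hypothetical sketch of a shrinker scan loop that gives up the
 * list lock early when another CPU is spinning on it.
 */
#include <linux/spinlock.h>
#include <linux/list.h>
#include <linux/shrinker.h>

struct my_cache {
	spinlock_t		lock;	/* protects lru */
	struct list_head	lru;	/* LRU of cached entries */
};

struct my_entry {
	struct list_head	lru;
};

static unsigned long my_shrink_scan(struct my_cache *cache,
				    struct shrink_control *sc)
{
	struct my_entry *entry, *tmp;
	unsigned long freed = 0;

	spin_lock(&cache->lock);
	list_for_each_entry_safe(entry, tmp, &cache->lru, lru) {
		if (freed >= sc->nr_to_scan)
			break;
		/*
		 * The variant discussed above: bail out as soon as someone
		 * else is waiting for the file-system-wide lock.  The
		 * counter-argument is that polling the lock word causes an
		 * extra cache line bounce that may cost more than simply
		 * finishing the (already bounded) scan.
		 */
		if (spin_is_contended(&cache->lock))
			break;
		list_del_init(&entry->lru);
		freed++;
		/* real code would reclaim/free the entry here */
	}
	spin_unlock(&cache->lock);
	return freed;
}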