On Wed, Sep 08, 2010 at 11:00:57PM -0400, Christoph Hellwig wrote:
> On Thu, Sep 09, 2010 at 01:20:43AM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> >
> > Having multiple CPUs trying to do the same cache shrinking work can
> > be actively harmful to performance when the shrinkers land in the
> > same AGs. They then lockstep on perag locks, causing contention and
> > slowing each other down. Reclaim walking is sufficiently efficient
> > that we do not need parallelism to make significant progress, so stop
> > parallel access at the door.
> >
> > Instead, keep track of the number of objects the shrinkers want
> > cleaned and make sure the single running shrinker does not stop
> > until it has hit the threshold that the other shrinker calls have
> > built up.
> >
> > This increases the cold-cache unlink rate of an 8-way parallel unlink
> > workload from about 15,000 unlinks/s to 60-70,000 unlinks/s for the
> > same CPU usage (~700%), resulting in the runtime for a 200M inode
> > unlink workload dropping from 4h50m to just under 1 hour.
>
> The code looks good, but long term I think this needs to be fixed
> in the caller, not in every shrinker instance.

Agreed.

> Reviewed-by: Christoph Hellwig <hch@xxxxxx>
>
> > +	nr_to_scan += atomic64_read(&mp->m_ino_shrink_nr);
> > +	atomic64_set(&mp->m_ino_shrink_nr, 0);
>
> To be totally race free this should use atomic64_cmpxchg.

Oh, I didn't realise that existed. I'll fix it.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx
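
For illustration, the race in the read-then-set pair quoted above, and a compare-and-exchange loop that avoids it, can be sketched in userspace C11. This is only a sketch under stated assumptions, not the patch that was eventually posted: it uses <stdatomic.h> rather than the kernel's atomic64_* API, and the names ino_shrink_nr, defer_scan, take_pending_racy and take_pending_cmpxchg are hypothetical stand-ins.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for mp->m_ino_shrink_nr: work deferred by
 * shrinker calls that were turned away at the door. */
static _Atomic int64_t ino_shrink_nr;

/* A shrinker that does not get to run adds its request to the pile. */
static void defer_scan(int64_t nr)
{
    atomic_fetch_add(&ino_shrink_nr, nr);
}

/* Mirrors the quoted read-then-set pair. Any defer_scan() that lands
 * between the load and the store is overwritten by the store and lost. */
static int64_t take_pending_racy(void)
{
    int64_t pending = atomic_load(&ino_shrink_nr);
    atomic_store(&ino_shrink_nr, 0);
    return pending;
}

/* Compare-and-exchange loop: the counter is reset to zero only if it
 * still holds the value that was read. On failure, 'pending' is
 * refreshed with the current value and the exchange is retried. */
static int64_t take_pending_cmpxchg(void)
{
    int64_t pending = atomic_load(&ino_shrink_nr);

    while (!atomic_compare_exchange_weak(&ino_shrink_nr, &pending, 0))
        ; /* 'pending' now holds the latest value; try again */
    return pending;
}
```

The running shrinker would add the returned count to its own nr_to_scan. In C11 an unconditional atomic_exchange(&ino_shrink_nr, 0) gives the same result without a retry loop; whether that or a cmpxchg loop reads better in the real code is a judgment call for the patch author.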