On Thu, Sep 09, 2010 at 01:20:43AM +1000, Dave Chinner wrote:
> From: Dave Chinner <dchinner@xxxxxxxxxx>
>
> Having multiple CPUs trying to do the same cache shrinking work can
> be actively harmful to performance when the shrinkers land in the
> same AGs. They then lockstep on perag locks, causing contention and
> slowing each other down. Reclaim walking is sufficiently efficient
> that we do not need parallelism to make significant progress, so stop
> parallel access at the door.
>
> Instead, keep track of the number of objects the shrinkers want
> cleaned and make sure the single running shrinker does not stop
> until it has hit the threshold that the other shrinker calls have
> built up.
>
> This increases the cold-cache unlink rate of an 8-way parallel unlink
> workload from about 15,000 unlinks/s to 60-70,000 unlinks/s for the
> same CPU usage (~700%), resulting in the runtime for a 200M inode
> unlink workload dropping from 4h50m to just under 1 hour.

The code looks good, but long term I think this needs to be fixed in
the caller, not in every shrinker instance.

Reviewed-by: Christoph Hellwig <hch@xxxxxx>

> + nr_to_scan += atomic64_read(&mp->m_ino_shrink_nr);
> + atomic64_set(&mp->m_ino_shrink_nr, 0);

To be totally race free this should use atomic64_cmpxchg.

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
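
To illustrate the race in the quoted hunk: if another CPU adds to
m_ino_shrink_nr between the atomic64_read() and the atomic64_set(), the
set to zero discards that addition. The sketch below shows the
read-and-clear done as a single atomic step. It is a userspace
approximation using C11 <stdatomic.h> rather than the kernel atomic64_*
API, and the names pending_nr, defer_scan and claim_pending_* are
hypothetical stand-ins, not code from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for mp->m_ino_shrink_nr (not the XFS field). */
static _Atomic int64_t pending_nr;

/* A shrinker call that backs off records how much it wanted scanned. */
static void defer_scan(int64_t nr)
{
	assert(nr >= 0);
	atomic_fetch_add(&pending_nr, nr);
}

/*
 * Claim the accumulated count with a compare-and-swap loop, in the
 * spirit of the atomic64_cmpxchg suggestion. If another thread changes
 * the counter between the load and the swap, the swap fails, 'old' is
 * refreshed with the current value, and the loop retries, so no
 * addition is lost.
 */
static int64_t claim_pending_cmpxchg(void)
{
	int64_t old = atomic_load(&pending_nr);

	while (!atomic_compare_exchange_weak(&pending_nr, &old, 0))
		;
	return old;
}

/* The same effect with an unconditional atomic exchange. */
static int64_t claim_pending_xchg(void)
{
	return atomic_exchange(&pending_nr, 0);
}
```

The running shrinker would then add the claimed value to nr_to_scan. In
kernel code the exchange variant would presumably map to atomic64_xchg(),
which avoids the retry loop; that choice is a suggestion here, not
something stated in the thread.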