On Tue, Sep 14, 2010 at 12:27:42PM -0400, Christoph Hellwig wrote:
> On Tue, Sep 14, 2010 at 08:56:04PM +1000, Dave Chinner wrote:
> > From: Dave Chinner <dchinner@xxxxxxxxxx>
> >
> > With delayed logging greatly increasing the sustained parallelism of
> > inode operations, the inode cache locking is showing significant
> > read vs write contention when inode reclaim runs at the same time as
> > lookups. There are also a lot more write lock acquisitions than
> > there are read locks (4:1 ratio), so the read locking is not really
> > buying us much in the way of parallelism.
>
> That's just for your parallel creates workload, isn't it?  If we'd
> get that bad hit rates on normal workloads something is pretty
> wrong with the inode cache.

I see it on the unlink workloads, as well as any use-once workload I
care to run (e.g. du, find, grep, xfsdump, xfs_fsr, etc.) on a
filesystem with more inodes in it than can fit in the currently
available memory. When we have userspace memory pressure, the amount
of memory available for caching can be pretty low, so this is quite a
common situation.

> For a workload with 4 times as many writelocks just changing the
> rwlock to a spinlock should provide more benefits.  Did you test what
> effect this has on other workloads?

Not yet - I've run it through xfsqa on single CPU, 2p and 4p machines
for the past few days, but I haven't benchmarked it comparatively on
other workloads yet. The whole patchset needs that....

> In addition to that I feel really uneasy about changes to the inode
> cache locking without really heavy NFS server testing - we just had
> too many issues in this area in the past.

Agreed, and that's something I need to test against.

> > To avoid the read vs write contention, change the cache to use RCU
> > locking on the read side. To avoid needing to RCU free every single
> > inode, use the built-in slab RCU freeing mechanism. This requires
> > us to be able to detect lookups of freed inodes, so ensure that
> > every freed inode has an inode number of zero and the XFS_IRECLAIM
> > flag set. We already check the XFS_IRECLAIM flag in the cache hit
> > lookup path, but also add a check for a zero inode number as well.
>
> How does this interact with slab poisoning?

mm/slab.c::kmem_cache_create():

	if (!(flags & SLAB_DESTROY_BY_RCU))
		flags |= SLAB_POISON;
#endif
	if (flags & SLAB_DESTROY_BY_RCU)
		BUG_ON(flags & SLAB_POISON);

and SLUB does an equivalent setup where OBJECT_POISON is not set if
SLAB_DESTROY_BY_RCU is set. So it effectively turns off slab
poisoning, but leaves all the other debug checks active.

Cheers,

Dave.
-- 
Dave Chinner
david@xxxxxxxxxxxxx

_______________________________________________
xfs mailing list
xfs@xxxxxxxxxxx
http://oss.sgi.com/mailman/listinfo/xfs
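
For readers unfamiliar with the mechanism being discussed, the following
is a minimal sketch of the SLAB_DESTROY_BY_RCU lookup/free pattern the
patch description refers to. It is not the actual XFS patch: struct foo,
FOO_RECLAIM, foo_lookup() and foo_free() are invented names, and XFS
indexes its in-core inodes in per-AG radix trees rather than a simple
chain; the sketch only illustrates the "re-validate identity under a
lock after the RCU lookup" idea.

/*
 * Sketch only.  With SLAB_DESTROY_BY_RCU the slab *pages* are freed
 * under RCU, but an individual object may be freed and immediately
 * reused for a new object of the same type.  A lookup done under
 * rcu_read_lock() must therefore re-check the object's identity under
 * a lock after finding it.  In real code the spinlock would have to be
 * initialised from a slab constructor so it remains valid across
 * free/reuse cycles.
 */
#include <linux/slab.h>
#include <linux/spinlock.h>
#include <linux/rcupdate.h>

struct foo {
	spinlock_t		lock;
	unsigned long		ino;	/* zeroed when the object is freed */
	unsigned long		flags;	/* FOO_RECLAIM set while tearing down */
	struct foo __rcu	*next;	/* hypothetical lookup chain */
};

#define FOO_RECLAIM	0x01

static struct kmem_cache *foo_cache;

static int foo_cache_init(void)
{
	foo_cache = kmem_cache_create("foo_cache", sizeof(struct foo), 0,
				      SLAB_DESTROY_BY_RCU, NULL);
	return foo_cache ? 0 : -ENOMEM;
}

/* Returns the object locked, or NULL if it is not (or no longer) there. */
static struct foo *foo_lookup(struct foo __rcu **chainp, unsigned long ino)
{
	struct foo *p;

	rcu_read_lock();
	for (p = rcu_dereference(*chainp); p; p = rcu_dereference(p->next)) {
		if (p->ino != ino)
			continue;
		spin_lock(&p->lock);
		/*
		 * The memory may have been freed and reused since we
		 * found it: a freed object has ino == 0 and FOO_RECLAIM
		 * set, a reused one carries a different inode number.
		 * Either way it is not the object we were looking for.
		 */
		if (p->ino != ino || (p->flags & FOO_RECLAIM)) {
			spin_unlock(&p->lock);
			continue;
		}
		rcu_read_unlock();
		return p;
	}
	rcu_read_unlock();
	return NULL;
}

static void foo_free(struct foo *p)
{
	/* mark the object dead before freeing so concurrent lookups see it */
	spin_lock(&p->lock);
	p->ino = 0;
	p->flags |= FOO_RECLAIM;
	spin_unlock(&p->lock);
	kmem_cache_free(foo_cache, p);
}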