On Tue, Jan 20, 2009 at 10:22:00PM -0800, Mike Waychison wrote:
> Andi Kleen wrote:
> >Mike Waychison <mikew@xxxxxxxxxx> writes:
> >
> >>livelock on dcache_lock/inode_lock (specifically in
> >>atomic_dec_and_lock())
> >
> >I'm not sure how something can livelock in atomic_dec_and_lock which
> >doesn't take a spinlock itself? Are you saying you run into NUMA memory
> >unfairness here? Or did I misparse you?
>
> By atomic_dec_and_lock, I really meant to say _atomic_dec_and_lock().

Ok. So it's basically just the lock that is taken?

In theory one could likely provide an x86-specific dec_and_lock that
performs better by not taking the lock while the count is still > 0,
but of course that only helps if the count usually is still > 0 after
the decrement. Is that a common situation in your test?

> It takes the spinlock if the cmpxchg hidden inside atomic_dec_unless
> fails.
>
> There are likely NUMA unfairness issues at play, but it's not the main
> worry at this point.
>
> >
> >>This patchset is an attempt to try and reduce the locking overheads
> >>associated with final dput() and final iput(). This is done by
> >>batching dentries and inodes into per-process queues and processing
> >>them in 'parallel' to consolidate some of the locking.
> >
> >I was wondering what this does to the latencies when dput/iput
> >is only done for very objects. Does it increase costs then
> >significantly?
>
> very objects?

Sorry. "is only done for very few objects". Somehow the few got lost.

Basically latency in the unloaded case. I always worry when people do
complicated things for the high-load case how the more usual "do it
for a single object" workload fares.

> >
> >As a high level comment it seems like a lot of work to work
> >around global locks, like the inode_lock, where it might be better to
> >just split the lock up? Mind you I don't have a clear proposal
> >how to do that, but surely it's doable somehow.
> >
>
> Perhaps.. the only plausible way I can think this would be doable would
> be to rework the global resources (like the global inode_unused LRU list

One simple way would be to just use multiple lists, each with its own
lock. I doubt that would impact the LRU behaviour very much.

> and deal with inode state transitions), but even then, some sort of
> consistency needs to happen at the super_block level,

The sb could also look at multiple lists?
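
To make the multiple-lists idea concrete, here is a minimal sketch of
what a sharded inode_unused LRU could look like, picking a shard by
hashing the inode pointer. All names are made up for illustration
(nothing here beyond the locking/list/hash primitives is existing
kernel API), and i_list is used as the LRU linkage as in current 2.6
kernels:

#include <linux/spinlock.h>
#include <linux/list.h>
#include <linux/hash.h>
#include <linux/fs.h>

#define LRU_SHARD_BITS	4
#define NR_LRU_SHARDS	(1 << LRU_SHARD_BITS)

/* One LRU fragment per shard, each with its own lock. The type is
 * cacheline aligned so two shards don't false-share a line.
 */
struct lru_shard {
	spinlock_t lock;
	struct list_head list;
} ____cacheline_aligned_in_smp;

static struct lru_shard inode_unused_shards[NR_LRU_SHARDS];

static struct lru_shard *inode_lru_shard(struct inode *inode)
{
	/* Hash the pointer so inodes spread evenly across the shards */
	return &inode_unused_shards[hash_ptr(inode, LRU_SHARD_BITS)];
}

static void inode_lru_add(struct inode *inode)
{
	struct lru_shard *shard = inode_lru_shard(inode);

	spin_lock(&shard->lock);
	list_add(&inode->i_list, &shard->list);
	spin_unlock(&shard->lock);
}

Reclaim would then walk the shards round robin, so the LRU
approximation should not get much worse; the per-sb consistency
question becomes "walk all shards", which is the open issue above.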
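
For reference on the dec_and_lock point at the top: the generic
_atomic_dec_and_lock() already has a lock-free fast path while the
count stays above 1. Roughly (paraphrased from lib/dec_and_lock.c,
modulo per-version details):

#include <linux/spinlock.h>
#include <asm/atomic.h>

int _atomic_dec_and_lock(atomic_t *atomic, spinlock_t *lock)
{
	/* Fast path: decrement unless the count is 1. This is the
	 * cmpxchg loop mentioned above; if it succeeds, the count
	 * was > 1 and no spinlock is needed at all.
	 */
	if (atomic_add_unless(atomic, -1, 1))
		return 0;

	/* Slow path: this is (probably) the final reference, so the
	 * decrement to zero must happen under the lock.
	 */
	spin_lock(lock);
	if (atomic_dec_and_test(atomic))
		return 1;
	spin_unlock(lock);
	return 0;
}

So an x86-specific variant could at best shave the cmpxchg loop off
the fast path; the contended final-reference case still has to take
the spinlock, which is where the reported problem lives.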
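
And for the batching in the patchset itself, the rough shape of a
per-process queue for final dput() might be something like the sketch
below. __dput_final() stands in for whatever drops the last reference
with dcache_lock already held; none of these names are the patchset's
actual interface:

#include <linux/dcache.h>
#include <linux/spinlock.h>

#define DPUT_BATCH	16

/* Hypothetical per-task queue of dentries awaiting a final dput() */
struct dput_batch {
	unsigned int nr;
	struct dentry *victims[DPUT_BATCH];
};

/* Hypothetical: final dput with dcache_lock already held */
static void __dput_final(struct dentry *dentry);

static void dput_batch_flush(struct dput_batch *batch)
{
	unsigned int i;

	/* One dcache_lock round trip releases the whole batch,
	 * instead of one acquisition per final dput().
	 */
	spin_lock(&dcache_lock);
	for (i = 0; i < batch->nr; i++)
		__dput_final(batch->victims[i]);
	spin_unlock(&dcache_lock);
	batch->nr = 0;
}

static void dput_batched(struct dput_batch *batch, struct dentry *dentry)
{
	batch->victims[batch->nr++] = dentry;
	if (batch->nr == DPUT_BATCH)
		dput_batch_flush(batch);
}

The unloaded-case latency worry is visible here: a dentry can sit in
the queue until the batch fills or is explicitly flushed, and a
single-object workload pays the queueing overhead without amortizing
any lock acquisitions.

-Andi
-- 
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.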