On Tue 11-06-24 20:17:16, Dave Chinner wrote:
> Your patch, however, just converts *some* of the lookup API
> operations to use RCU. It adds complexity for things like inserts
> which are going to need inode hash locking if the RCU lookup fails,
> anyway.
>
> Hence your patch optimises the case where the inode is in cache but
> the dentry isn't, but we'll still get massive contention on lookup
> when the RCU lookup on the inode cache and inserts are always going
> to be required.
>
> IOWs, even RCU lookups are not going to prevent inode hash lock
> contention for parallel cold cache lookups. Hence, with RCU,
> applications are going to see unpredictable contention behaviour
> dependent on the memory footprint of the caches at the time of the
> lookup. Users will have no way of predicting when the behaviour will
> change, let alone have any way of mitigating it. Unpredictable
> variable behaviour is the thing we want to avoid the most with core
> OS caches.

I don't believe this is what Mateusz's patches do (but maybe I've terribly
misread them). iget_locked() does:

	spin_lock(&inode_hash_lock);
	inode = find_inode_fast(...);
	spin_unlock(&inode_hash_lock);
	if (inode)
		we are happy and return
	inode = alloc_inode(sb);
	spin_lock(&inode_hash_lock);
	old = find_inode_fast(...)
	the rest of insert code
	spin_unlock(&inode_hash_lock);

And Mateusz got rid of the first lock-unlock pair by teaching
find_inode_fast() to *also* operate under RCU. The second lookup &
insertion stays under inode_hash_lock as it is now. So his optimization is
orthogonal to your hash bit lock improvements AFAICT. Sure his optimization
just ~halves the lock hold time for uncached cases (for cached it
completely eliminates the lock acquisition but I agree these are not that
interesting) so it is not a fundamental scalability improvement but still
it is a nice win for a contended lock AFAICT.

								Honza
-- 
Jan Kara <jack@xxxxxxxx>
SUSE Labs, CR