On Tue, Apr 06, 2021 at 10:33:40PM +1000, Dave Chinner wrote:
> Hi folks,
>
> Recently I've been doing some scalability characterisation of
> various filesystems, and one of the limiting factors that has
> prevented me from exploring filesystem characteristics is the
> inode hash table, namely the global inode_hash_lock that protects
> it.
>
> This has long been a problem, but I personally haven't cared about
> it because, well, XFS doesn't use it and so it's not a limiting
> factor for most of my work. However, in trying to characterise the
> scalability boundaries of bcachefs, I kept hitting against VFS
> limitations first. bcachefs hits the inode hash table pretty hard
> and it becomes a contention point a lot sooner than it does for
> ext4. Btrfs also uses the inode hash, but its namespace doesn't
> have the capability to stress the inode hash lock due to it hitting
> internal contention first.
>
> Long story short, I did what should have been done a decade or more
> ago - I converted the inode hash table to use hlist-bl to split up
> the global lock. This is modelled on the dentry cache, with one
> minor tweak. That is, the inode hash value cannot be calculated from
> the inode, so we have to keep a record of either the hash value or a
> pointer to the hlist-bl list head that the inode is hashed into so
> that we can lock the correct list on removal.
>
> Other than that, this is mostly just a mechanical conversion from
> one list and lock type to another. None of the algorithms have
> changed and none of the RCU behaviours have changed. But it removes
> the inode_hash_lock from the picture and so performance for bcachefs
> goes way up and CPU usage for ext4 halves at 16 and 32 threads. At
> higher thread counts, we start to hit filesystem and other VFS locks
> as the limiting factors. Profiles and performance numbers are in
> patch 3 for those that are curious.
>
> I've been running this in benchmarks and perf testing across
> bcachefs, btrfs and ext4 for a couple of weeks, and it passes
> fstests on ext4 and btrfs without regressions. So now it needs more
> eyes and testing and hopefully merging....

These patches have been in the bcachefs repo for a bit with no issues,
and they definitely do help with performance - thanks, Dave!
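
For anyone skimming the thread, the "minor tweak" in the cover letter (each inode remembers which list head it is hashed into, so removal can lock the correct list) can be sketched in user-space C roughly as below. This is not the code from the patches. The names `struct bucket`, `struct node`, `node_hash()`, `node_unhash()` and `node_lookup()` and the table size are all made up for illustration. The lock here is a separate per-bucket spin flag built on C11 atomics, whereas the kernel's hlist-bl keeps the lock in bit 0 of the list head pointer, and the sketch has no RCU.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define NBUCKETS 8u /* hypothetical table size, must be a power of two */

struct node;

/* One lock per hash chain replaces the single global lock. */
struct bucket {
    atomic_bool lock;    /* simplification: hlist-bl uses bit 0 of the head */
    struct node *first;
};

struct node {
    unsigned long key;   /* stands in for the inode number */
    struct node *next;
    struct node **pprev; /* address of the pointer that points at this node */
    struct bucket *head; /* bucket this node is hashed into, NULL if unhashed */
};

/* Static storage is zero-initialised: all buckets start unlocked and empty. */
static struct bucket table[NBUCKETS];

static void bucket_lock(struct bucket *b)
{
    while (atomic_exchange_explicit(&b->lock, true, memory_order_acquire))
        ; /* spin until the previous holder releases the bucket */
}

static void bucket_unlock(struct bucket *b)
{
    atomic_store_explicit(&b->lock, false, memory_order_release);
}

static struct bucket *bucket_for(unsigned long hashval)
{
    return &table[hashval & (NBUCKETS - 1)];
}

/* Insert at the head of the chain and record the bucket in the node. */
static void node_hash(struct node *n, unsigned long hashval)
{
    struct bucket *b = bucket_for(hashval);

    assert(n->head == NULL); /* a node must not be hashed twice */
    bucket_lock(b);
    n->next = b->first;
    if (b->first)
        b->first->pprev = &n->next;
    n->pprev = &b->first;
    b->first = n;
    n->head = b;
    bucket_unlock(b);
}

/*
 * Removal cannot recompute the hash from the node alone, so it uses the
 * recorded bucket pointer to find the lock. Assumes the caller serialises
 * hash/unhash of any single node against each other.
 */
static void node_unhash(struct node *n)
{
    struct bucket *b = n->head;

    if (!b)
        return;
    bucket_lock(b);
    *n->pprev = n->next;
    if (n->next)
        n->next->pprev = n->pprev;
    n->next = NULL;
    n->pprev = NULL;
    n->head = NULL;
    bucket_unlock(b);
}

/* Walk one chain under its own lock; other chains are not blocked. */
static struct node *node_lookup(unsigned long hashval, unsigned long key)
{
    struct bucket *b = bucket_for(hashval);
    struct node *n;

    bucket_lock(b);
    for (n = b->first; n; n = n->next)
        if (n->key == key)
            break;
    bucket_unlock(b);
    return n;
}
```

Storing the bucket pointer rather than the hash value saves recomputing the table index on removal, at the cost of one pointer per node. The cover letter mentions both as options.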