On Mon, 23 Sept 2024 at 20:55, Dave Chinner <david@xxxxxxxxxxxxx> wrote: > > That's effectively what the patch did - it added a spinlock per hash > list. Yeah, no, I like your patches, they all seem to be doing sane things and improve the code. The part I don't like is how the basic model that your patches improve seems quite a bit broken. For example, that whole superblock list lock contention just makes me go "Christ, that's stupid". Not because your patches to fix it with Waiman's fancy list would be wrong or bad, but because the whole pain is self-inflicted garbage. And it's all historically very very understandable. It wasn't broken when it was written. That singly linked list is 100% sane in the historical context of "we really don't want anything fancy for this". The whole list of inodes for a superblock is basically unimportant, and it's main use (only use?) is for the final "check that we don't have busy inodes at umount time, remove any pending stuff". So using a stupid linked list was absolutely the right thing to do, but now that just the *locking* for that list is a pain, it turns out that we probably shouldn't use a list at all. Not because the list was wrong, but because a flat list is such a pain for locking, since there's no hierarchy to spread the locking out over. (We used to have that kind of "flat lock" for the dcache too, but "dcache_lock" went away many moons ago, and good riddance - but the only reason it could go away is that the dcache has a hierarchy, so that you can always lock just the local dentry (or the parent dentry) if you are just careful). > [ filesystems doing their own optimized thing ] > > IOWs, it's not clear to me that this is a caching model we really > want to persue in general because of a) the code duplication and b) > the issues such an inode life-cycle model has interacting with the > VFS life-cycle expectations... No, I'm sure you are right, and I'm just frustrated with this code that probably _should_ look a lot more like the dcache code, but doesn't. I get the very strong feeling that we should have a per-superblock hash table that we could also traverse the entries of. That way the superblock inode list would get some structure (the hash buckets) that would allow the locking to be distributed (and we'd only need one lock and just share it between the "hash inode" and "add it to the superblock list"). But that would require something much more involved than "improve the current code". Linus