On Wed, 2 Oct 2024 at 05:35, Dave Chinner <david@xxxxxxxxxxxxx> wrote:
>
> On Wed, Oct 02, 2024 at 12:00:01PM +0200, Christian Brauner wrote:
>
> > I don't have big conceptual issues with the series otherwise. The only
> > thing that makes me a bit uneasy is that we are now providing an api
> > that may encourage filesystems to do their own inode caching even if
> > they don't really have a need for it just because it's there. So really
> > a way that would've solved this issue generically would have been my
> > preference.
>
> Well, that's the problem, isn't it? :/
>
> There really isn't a good generic solution for global list access
> and management. The dlist stuff kinda works, but it still has
> significant overhead and doesn't get rid of spinlock contention
> completely because of the lack of locality between list add and
> remove operations.

I much prefer the approach taken in your patch series, to let the
filesystem own the inode list and keeping the old model as the
"default list".

In many ways, that is how *most* of the VFS layer works - it exposes
helper functions that the filesystems can use (and most do), but
doesn't force them.

Yes, the VFS layer does force some things - you can't avoid using
dentries, for example, because that's literally how the VFS layer
deals with filenames (and things like mounting etc). And honestly, the
VFS layer does a better job of filename caching than any filesystem
really can do, and with the whole UNIX mount model, filenames
fundamentally cross filesystem boundaries anyway.

But clearly the VFS layer inode list handling isn't the best it can
be, and unless we can fix that in some fundamental way (and I don't
love the "let's use crazy lists instead of a simple one" models) I do
think that just letting filesystems do their own thing if they have
something better is a good model.

That's how we deal with all the basic IO, after all. The VFS layer has
lots of support routines, but filesystems don't *have* to use things
like generic_file_read_iter() and friends.

Yes, most filesystems do use generic_file_read_iter() in some form or
other (sometimes raw, sometimes wrapped with filesystem logic),
because it fits their model, it's convenient, and it handles all the
normal stuff well, but you don't *have* to use it if you have special
needs.

Taking that approach to the inode caching sounds sane to me, and I
generally like Dave's series. It looks like an improvement to me.

Linus
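
For illustration only, and not part of the thread: a rough userspace
sketch of the "helpers are offered, not forced" pattern described
above. This is not kernel code. Every name in it (toy_file,
toy_file_ops, toy_generic_read and so on) is hypothetical and only
loosely models how a filesystem might use a generic read helper raw,
wrap it with its own logic, or ignore it.

```c
#include <stddef.h>
#include <string.h>

struct toy_file;  /* hypothetical stand-in, not the kernel's struct file */

/* Per-"filesystem" operations table; each filesystem fills in its own. */
struct toy_file_ops {
	/* Copy up to len bytes starting at pos into buf; return bytes copied. */
	size_t (*read)(const struct toy_file *f, size_t pos,
		       char *buf, size_t len);
};

struct toy_file {
	const char *data;               /* backing bytes */
	size_t size;                    /* number of valid bytes in data */
	const struct toy_file_ops *ops; /* chosen by the filesystem */
};

/* Generic helper: handles the normal case, but nobody is forced to use it. */
size_t toy_generic_read(const struct toy_file *f, size_t pos,
			char *buf, size_t len)
{
	if (pos >= f->size)
		return 0;
	if (len > f->size - pos)
		len = f->size - pos;
	memcpy(buf, f->data + pos, len);
	return len;
}

/* Filesystem A: uses the generic helper raw. */
const struct toy_file_ops plain_ops = { .read = toy_generic_read };

/* Filesystem B: wraps the helper with its own logic (caps reads at 4 bytes). */
size_t capped_read(const struct toy_file *f, size_t pos,
		   char *buf, size_t len)
{
	if (len > 4)
		len = 4;
	return toy_generic_read(f, pos, buf, len);
}
const struct toy_file_ops capped_ops = { .read = capped_read };

/* Filesystem C: special needs, ignores the helper and returns zero bytes. */
size_t zero_read(const struct toy_file *f, size_t pos,
		 char *buf, size_t len)
{
	if (pos >= f->size)
		return 0;
	if (len > f->size - pos)
		len = f->size - pos;
	memset(buf, 0, len);
	return len;
}
const struct toy_file_ops zero_ops = { .read = zero_read };

/* The "core" layer only ever calls through the ops table. */
size_t toy_read(const struct toy_file *f, size_t pos, char *buf, size_t len)
{
	return f->ops->read(f, pos, buf, len);
}
```

The caller never needs to know which of the three a file uses. The
inode list case in the series is analogous in spirit only: the default
list stays available, and a filesystem with something better can
supply its own.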