Dave,

On Tue, Apr 13 2021 at 19:58, Dave Chinner wrote:
> On Tue, Apr 13, 2021 at 01:18:35AM +0200, Thomas Gleixner wrote:
> So for solving the inode cache scalability issue with RT in mind,
> we're left with these choices:
>
> a) increase memory consumption and cacheline misses for everyone by
>    adding a spinlock per hash chain so that RT kernels can do their
>    substitution magic and make the memory footprint and scalability
>    for RT kernels worse
>
> b) convert the inode hash table to something different (rhashtable,
>    radix tree, Xarray, etc) that is more scalable and more "RT
>    friendly".
>
> c) have RT kernel substitute hlist_bl with hlist_head and a spinlock
>    so that it all works correctly on RT kernels and only RT kernels
>    take the memory footprint and cacheline miss penalties...
>
> We rejected a) for the dentry hash table, so it is not an appropriate
> solution for the inode hash table for the same reasons.
>
> There is a lot of downside to b). Firstly there's the time and
> resources needed for experimentation to find an appropriate
> algorithm for both scalability and RT. Then all the insert, removal
> and search facilities will have to be rewritten, along with all the
> subtleties like "fake hashing" to allow filesystems to provide their
> own inode caches. The changes in behaviour and, potentially, API
> semantics will greatly increase the risk of regressions and adverse
> behaviour on both vanilla and RT kernels compared to option a) or
> c).
>
> It is clear that option c) is of minimal risk to vanilla kernels,
> and low risk to RT kernels. It's pretty straightforward to do for
> both configs, and only the RT kernels take the memory footprint
> penalty.
>
> So a technical analysis points to c) being the most reasonable
> resolution of the problem.

I agree with that analysis for technical reasons and I'm not entirely
unfamiliar with how to solve hlist_bl conversions on RT either, as you
might have guessed.

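For readers following along, the shape of option c) can be sketched in
userspace roughly as below. This is an illustration only and not kernel
code: SKETCH_RT, struct bucket and all helper names are made up for the
example, a pthread mutex merely stands in for the sleeping spinlock of
an RT kernel, and C11 atomics stand in for the kernel's bit spinlock.
The point it tries to show is that the default build keeps the lock in
bit 0 of the head pointer (one word per bucket), while the RT-style
build pays for a separate lock per bucket.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical list node; not the kernel's hlist_bl_node. */
struct node {
	struct node *next;
	unsigned long key;
};

#ifdef SKETCH_RT
/* RT-style variant: plain head pointer plus a separate lock per bucket. */
struct bucket {
	struct node *first;
	pthread_mutex_t lock;
};
#define BUCKET_INIT { NULL, PTHREAD_MUTEX_INITIALIZER }

static void bucket_lock(struct bucket *b)   { pthread_mutex_lock(&b->lock); }
static void bucket_unlock(struct bucket *b) { pthread_mutex_unlock(&b->lock); }
static struct node *bucket_first(struct bucket *b) { return b->first; }
static void bucket_set_first(struct bucket *b, struct node *n) { b->first = n; }
#else
/* Default variant: bit 0 of the head pointer doubles as the lock. */
struct bucket {
	atomic_uintptr_t first;
};
#define BUCKET_INIT { 0 }

static void bucket_lock(struct bucket *b)
{
	/* Spin until this caller is the one that set bit 0. */
	while (atomic_fetch_or_explicit(&b->first, 1, memory_order_acquire) & 1)
		sched_yield();
}

static void bucket_unlock(struct bucket *b)
{
	/* Clear bit 0, leaving the pointer bits untouched. */
	atomic_fetch_and_explicit(&b->first, ~(uintptr_t)1, memory_order_release);
}

static struct node *bucket_first(struct bucket *b)
{
	uintptr_t v = atomic_load_explicit(&b->first, memory_order_relaxed);

	return (struct node *)(v & ~(uintptr_t)1);	/* mask off the lock bit */
}

static void bucket_set_first(struct bucket *b, struct node *n)
{
	/* Caller holds the lock, so keep bit 0 set while storing. */
	atomic_store_explicit(&b->first, (uintptr_t)n | 1, memory_order_relaxed);
}
#endif

/* Callers below are identical for both variants. */
static void bucket_add(struct bucket *b, struct node *n)
{
	/* Bit 0 must be free for the lock, so nodes need >= 2 byte alignment. */
	assert(((uintptr_t)n & 1) == 0);

	bucket_lock(b);
	n->next = bucket_first(b);
	bucket_set_first(b, n);
	bucket_unlock(b);
}

static struct node *bucket_find(struct bucket *b, unsigned long key)
{
	struct node *n;

	bucket_lock(b);
	for (n = bucket_first(b); n; n = n->next)
		if (n->key == key)
			break;
	bucket_unlock(b);
	return n;
}
```

Building with -DSKETCH_RT selects the lock-per-bucket layout; the
callers stay unchanged, which is the property that makes a substitution
of this kind low risk for the non-RT configuration.
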
Having a technical argument to discuss and agree on is far simpler than
going along with "I don't care".

Thanks for taking the time to put a technical rationale on this!

       tglx