On Sat, Oct 16, 2010 at 08:42:57PM -0400, Christoph Hellwig wrote:
> On Sun, Oct 17, 2010 at 04:09:11AM +1100, Nick Piggin wrote:
> > If you want it to be scalable within a single sb, it needs to be
> > per-cpu. If it is per-cpu it does not need to be per-sb as well,
> > which just adds bloat.
>
> Right now the patches split up the inode lock and do not add
> per-cpu magic. It's not any more work to move from per-sb lists
> to per-cpu locking if we eventually do it than moving from global
> to per-cpu.

But per-sb lists are more work than a global list, and as I'm going to
per-cpu locking anyway, it's a strange transition to go from per-sb to
per-cpu (rather than per-sb, per-cpu). In short, the fact that I build
up the locking transformations starting with global locks is just not
something that can be held against my patch set (unless you really
disagree with the whole concept of how the series is structured).

> I'm not entirely convinced moving s_inodes to a per-cpu list is a good
> idea. For now per-sb is just fine for disk filesystems as they have
> much more fs-wide cachelines they touch for inode creation/deletion
> anyway, and for sockets/pipes a variant of your patch to not ever
> add them to s_inodes sounds like the better approach.

Traditional filesystems on slow spinning disks are not the main problem;
it's very fast SSDs and storage servers. Even with its per-AG lock
splitting, XFS can already hit per-sb scalability bottlenecks on small
servers with not-incredibly-fast storage. And if the VFS is not scalable,
the contention never even gets pushed into the filesystem, so the fs
developers never _see_ the locking problems they would need to fix.

This will increasingly be a problem: core counts and storage speeds keep
growing, and people want to manage more storage with fewer filesystems.
It's obvious that it will be a problem.
I've already got per-cpu locking in vfsmounts and the files lock, so
it's not magic.

> If we eventually hit the limit for disk filesystems I have some better
> ideas to solve this. One is to abuse whatever data structure we use
> for the inode hash also for iterating over all inodes - we only
> iterate over them in very few places, and none of them is a fast path.

Handwaving about changing data structures and better ideas is just not
helpful. _If_ you do have some better ideas, and _if_ we change the data
structure, _then_ it's trivial to move from per-cpu locking to your
better idea. It just doesn't work as an argument for slowing progress.