On Fri, Oct 15, 2010 at 02:30:17PM +1100, Nick Piggin wrote:
> On Fri, Oct 15, 2010 at 02:13:43PM +1100, Dave Chinner wrote:
> > You've shown it can be done, and that's great - it shows
> > us the impact of making those changes, but they need to be analysed
> > separately and treated on their own merits, not lumped with core
> > locking changes necessary for store-free path walking.
>
> Actually I didn't see anyone else object to doing this. Everybody
> else it seems acknowledges that it needs to be done, and it gets
> done naturally as a side effect of fine grained locking.

Let's just get back to this part, which seems to be the one you have
the most issues with, maybe? You're objecting to per-zone locks and
per-zone LRUs for inode and dcache?

Well, I have told you why per-zone LRUs are needed; I can expand on any
of the reasons if that is unclear. Per-zone locks, I think, come
naturally at the same time, and they will expose some fs bottlenecks,
but that is simply how scalability development works.

So, do you object to per-zone LRUs in particular, or per-zone locks?
(I.e. the potentially changed reclaim pattern, or the increased
parallelism.)

When you looked at this initially, you didn't understand how reclaim
works. It will not fill up a zone with inodes and then start reclaiming
all those inodes, leaving other nodes empty (unless that is how you
configure the machine, but it isn't the default). It fills up inodes
from all nodes (same as today) and it will start reclaiming from all
nodes at about the same pressure when there is a shortage.

Reclaim basically approximates LRU by scanning a little from the top
of each LRU. When you have many thousands of objects, and reclaim is a
really fallible and dumb process anyway, then the perturbation of the
reclaim pattern doesn't matter much. Our zone based page reclaim works
exactly the same way.
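
To make the "scan a little from the top of each LRU" point concrete,
here is a minimal toy sketch in Python. It is not kernel code: the
ZoneLRU class, the batch parameter, and the round-robin fill are all
assumptions made purely for illustration, and the real shrinkers size
their scans from memory pressure rather than a fixed batch.

```python
from collections import deque


class ZoneLRU:
    """Toy model: one LRU list per zone, oldest objects at the left end.

    Hypothetical illustration only; names and structure are not taken
    from the kernel.
    """

    def __init__(self, nr_zones):
        self.lists = [deque() for _ in range(nr_zones)]

    def add(self, zone, obj):
        # New objects go to the "young" end of their zone's list.
        self.lists[zone].append(obj)

    def reclaim(self, nr_to_free, batch=2):
        """Free nr_to_free objects, scanning at most `batch` objects
        from the old end of each zone's list per pass."""
        freed = []
        while len(freed) < nr_to_free and any(self.lists):
            for lru in self.lists:
                take = min(batch, len(lru), nr_to_free - len(freed))
                for _ in range(take):
                    freed.append(lru.popleft())
        return freed


# Objects are numbered in allocation order and spread over three zones,
# so a lower number means an older object.
lru = ZoneLRU(3)
for i in range(12):
    lru.add(i % 3, i)

freed = lru.reclaim(6, batch=2)
print(freed)                        # [0, 3, 1, 4, 2, 5]
print([len(l) for l in lru.lists])  # [2, 2, 2]
```

In this toy setup the freed objects are not in strict global LRU
order, but they are exactly the six oldest, and every zone ends up
under the same pressure. That is the sense in which the perturbation
of the reclaim pattern is small.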

I don't think you can possibly be arguing against more scalable locking
in reclaim, so perhaps you are also worried about increased parallelism
in the filesystem callbacks from reclaim?

I really can't see this being a big problem, any more than any other
increased parallelism on fses or other subsystems caused by scaling the
vfs. There might be some interesting issues with different locking
designs being hit in different ways, but really we can't stop progress
and test all loads on all filesystems. The way forward is to fix the
bottleneck in the filesystem or, if the filesystem sucks so badly that
it can't handle it, to just put a lock in there and not penalise
others.

It's not like I haven't tested it; I've spent the better part of the
past year testing things. The I_FREEING batching stuff is one example
where I found and fixed a small problem exposed by the reclaim changes.