On 10/15/2013 07:41 PM, Dave Chinner wrote:
> On Tue, Oct 15, 2013 at 01:41:28PM -0400, Johannes Weiner wrote:

>> I'm not forgetting about them, I just track them very coarsely by
>> linking up address spaces and then lazily enforce their upper limit
>> when memory is tight by using the shrinker callback. The assumption
>> was that actually scanning them is such a rare event that we trade the
>> rare computational costs for smaller memory consumption most of the
>> time.
>
> Sure, I understand the tradeoff that you made. But there's nothing
> worse than a system that slows down unpredictably because of some
> magic threshold in some subsystem has been crossed and
> computationally expensive operations kick in.

The shadow shrinker should remove the radix tree nodes with the
oldest shadow entries first, so true LRU should actually work for
the radix tree nodes.

Actually, since we only care about the age of the youngest shadow
entry in each radix tree node, FIFO will be the same as LRU for
that list.

That means the shrinker can always just take the radix tree nodes
off the end.

>> But it
>> looks like tracking radix tree nodes with a list and backpointers to
>> the mapping object for the lock etc. will be a major pain in the ass.
>
> Perhaps so - it may not work out when we get down to the fine
> details...

I suspect that a combination of lifetime rules (the inode cannot
disappear until all of its radix tree nodes are gone) and RCU
freeing of both the radix tree nodes and the inodes might do the
trick.

That would mean that, while holding the rcu read lock, the back
pointer from a radix tree node to the inode will always point to
valid memory.

That allows the shrinker to lock the inode, and verify that the
inode is still valid, before it attempts to rcu free the radix
tree node with shadow entries.

It also means that locking only needs to be in the inode, and on
the LRU list for shadow radix tree nodes.

Does that sound sane?

Am I overlooking something?
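To make the idea a bit more concrete, here is a rough userspace
sketch of the list handling. This is only a model, not kernel
code: struct inode_model, struct shadow_node, struct shadow_lru
and the helper names are all made up for illustration, the
"valid" flag stands in for whatever check the real inode would
need, and the locking and RCU steps appear only as comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an inode; not the kernel structure. */
struct inode_model {
	bool valid;           /* cleared once teardown has started */
	int nr_shadow_nodes;  /* nodes whose back pointer targets us */
};

/* Hypothetical radix tree node that holds only shadow entries. */
struct shadow_node {
	struct inode_model *inode;   /* back pointer to the owner */
	unsigned long youngest_age;  /* age of the youngest shadow entry */
	bool freed;                  /* models "handed to RCU free" */
	struct shadow_node *prev, *next;
};

/* FIFO list: head.next is the newest node, head.prev the oldest. */
struct shadow_lru {
	struct shadow_node head;     /* sentinel, carries no data */
	int nr;
};

static void lru_init(struct shadow_lru *lru)
{
	lru->head.next = &lru->head;
	lru->head.prev = &lru->head;
	lru->nr = 0;
}

/* Queue a node at the front; the real thing would hold the list lock. */
static void lru_add(struct shadow_lru *lru, struct shadow_node *n,
		    struct inode_model *inode, unsigned long age)
{
	assert(inode->valid);
	n->inode = inode;
	n->youngest_age = age;
	n->freed = false;
	n->next = lru->head.next;
	n->prev = &lru->head;
	lru->head.next->prev = n;
	lru->head.next = n;
	lru->nr++;
	inode->nr_shadow_nodes++;
}

static void lru_unlink(struct shadow_lru *lru, struct shadow_node *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	n->prev = n->next = NULL;
	lru->nr--;
}

/* Take up to nr_to_scan nodes off the old end; return how many were freed. */
static int shadow_shrink(struct shadow_lru *lru, int nr_to_scan)
{
	int freed = 0;

	/* Real code: RCU read side starts here, so n->inode stays valid memory. */
	while (nr_to_scan-- > 0 && lru->nr > 0) {
		struct shadow_node *n = lru->head.prev;  /* oldest node */
		struct inode_model *inode = n->inode;

		lru_unlink(lru, n);

		/* Real code: take the inode lock, then re-check validity. */
		if (!inode->valid)
			continue;  /* inode teardown owns this node now */

		inode->nr_shadow_nodes--;
		n->freed = true;   /* real code: free after a grace period */
		freed++;
	}
	return freed;
}
```

With three nodes queued in age order, shadow_shrink(&lru, 2)
frees the two oldest and leaves the newest on the list, without
ever walking the list or touching the page cache tree itself.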
-- 
All rights reversed