On Wed, Mar 16, 2022 at 01:52:23PM +1100, Dave Chinner wrote:
> On Wed, Mar 16, 2022 at 10:07:19AM +0800, Gao Xiang wrote:
> > On Tue, Mar 15, 2022 at 01:56:18PM -0700, Roman Gushchin wrote:
> > >
> > > > On Mar 15, 2022, at 12:56 PM, Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > >
> > > > The number of negative dentries is effectively constrained only by memory
> > > > size.  Systems which do not experience significant memory pressure for
> > > > an extended period can build up millions of negative dentries which
> > > > clog the dcache.  That can have different symptoms, such as inotify
> > > > taking a long time [1], high memory usage [2] and even just poor lookup
> > > > performance [3].  We've also seen problems with cgroups being pinned
> > > > by negative dentries, though I think we now reparent those dentries to
> > > > their parent cgroup instead.
> > >
> > > Yes, it should be fixed already.
> > >
> > > >
> > > > We don't have a really good solution yet, and maybe some focused
> > > > brainstorming on the problem would lead to something that actually works.
> > >
> > > I’d be happy to join this discussion. And in my opinion it’s going beyond negative dentries: there are other types of objects which tend to grow beyond any reasonable limits if there is no memory pressure.
> >
> > +1, we once had a similar issue as well, and agree that is not only
> > limited to negative dentries but all too many LRU-ed dentries and inodes.
>
> Yup, any discussion solely about managing buildup of negative
> dentries doesn't acknowledge that it is just a symptom of larger
> problems that need to be addressed.

Yes, but let's not make this _so_ broad a discussion that it becomes
unsolvable.  Rather, let's look for a solution to this particular problem
that can be adopted by other caches that share a similar problem.

For example, we might be seduced into saying "this is a slab problem"
because all the instances we have here allocate from slab.
But slab doesn't have enough information to solve the problem.
Maybe the working set of the current workload really needs 6 million
dentries to perform optimally.  Maybe it needs 600.  Slab can't know
that.  Maybe slab can play a role here, but the only component which
can know the appropriate size for a cache is the cache itself.

I think the logic needs to be in d_alloc().  Before it calls __d_alloc(),
it should check ... something ... to see if it should try to shrink
the LRU list.  The devil is in what that something should be.  I'm no
expert on the dcache; do we just want to call prune_dcache_sb() once
every 1000 calls?  Rely on DCACHE_REFERENCED to make sure that we're
not over-pruning the list?  If so, what do we set nr_to_scan to?  1000
so that we try to keep the dentry list the same size?  1500 so that it
actually tries to shrink?

I don't feel like I know enough to go further here.  But it feels better
than what we currently do -- calling all the shrinkers from deep in the
page allocator.
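
To make the open questions above a little more concrete, here is a toy
userspace model of "prune from the allocation path".  It is only a sketch
under stated assumptions, not kernel code: every toy_* name and both
constants are hypothetical, the LRU is a plain circular list, and a
boolean stands in for DCACHE_REFERENCED.  It does not call or imitate the
real prune_dcache_sb() beyond the general idea of scanning nr_to_scan
entries from the cold end of the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tunables taken from the numbers floated in the mail. */
#define PRUNE_INTERVAL 1000	/* prune once per this many allocations */
#define NR_TO_SCAN     1500	/* larger than the interval, so the list can shrink */

struct toy_dentry {
	struct toy_dentry *prev, *next;
	bool referenced;	/* stands in for DCACHE_REFERENCED */
};

struct toy_dcache {
	struct toy_dentry head;	/* circular list; head.next is the coldest entry */
	size_t nr;		/* entries currently on the list */
	unsigned long allocs;	/* allocations seen so far */
};

static void toy_dcache_init(struct toy_dcache *c)
{
	c->head.next = c->head.prev = &c->head;
	c->head.referenced = false;
	c->nr = 0;
	c->allocs = 0;
}

static void toy_lru_add_tail(struct toy_dcache *c, struct toy_dentry *d)
{
	d->prev = c->head.prev;
	d->next = &c->head;
	c->head.prev->next = d;
	c->head.prev = d;
	c->nr++;
}

static void toy_lru_del(struct toy_dcache *c, struct toy_dentry *d)
{
	assert(c->nr > 0);
	d->prev->next = d->next;
	d->next->prev = d->prev;
	c->nr--;
}

/*
 * Scan up to nr_to_scan entries from the cold end.  A referenced entry
 * gets a second chance: its bit is cleared and it moves to the warm end.
 * An unreferenced entry is freed.  Returns the number of entries freed.
 */
static size_t toy_prune(struct toy_dcache *c, size_t nr_to_scan)
{
	size_t freed = 0;

	while (nr_to_scan-- > 0 && c->nr > 0) {
		struct toy_dentry *d = c->head.next;

		toy_lru_del(c, d);
		if (d->referenced) {
			d->referenced = false;
			toy_lru_add_tail(c, d);
		} else {
			free(d);
			freed++;
		}
	}
	return freed;
}

/* Allocation path: every PRUNE_INTERVAL-th call prunes before allocating. */
static struct toy_dentry *toy_d_alloc(struct toy_dcache *c)
{
	struct toy_dentry *d;

	if (++c->allocs % PRUNE_INTERVAL == 0)
		toy_prune(c, NR_TO_SCAN);

	d = calloc(1, sizeof(*d));
	if (!d)
		return NULL;
	toy_lru_add_tail(c, d);
	return d;
}
```

In this model, if no entry is ever marked referenced, each prune pass
frees everything on the list, because 1500 exceeds the at most 1000
entries present when the pass runs.  If entries are marked referenced,
the first visit only clears the bit, and the remainder of the scan budget
then frees entries from the cold end.  That is the over-pruning question
in miniature; the model says nothing about what the right values are for
a real workload.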