On Tue, Feb 27, 2024 at 7:40 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
>
> On Tue, Feb 27, 2024 at 3:37 AM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> >
> > On Tue, Feb 27, 2024 at 5:44 PM Yosry Ahmed <yosryahmed@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Feb 27, 2024 at 1:39 AM Yafang Shao <laoar.shao@xxxxxxxxx> wrote:
> > > >
> > > > On Tue, Feb 27, 2024 at 5:05 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > > >
> > > > > On Tue 27-02-24 13:48:31, Yafang Shao wrote:
> > > > > > On Mon, Feb 26, 2024 at 10:05 PM Michal Hocko <mhocko@xxxxxxxx> wrote:
> > > > > [...]
> > > > > > > > To manage disk storage efficiently, we employ an agent that
> > > > > > > > identifies container images eligible for destruction once all
> > > > > > > > instances of that image have exited.
> > > > > > > >
> > > > > > > > However, during destruction, dealing with directories containing
> > > > > > > > numerous negative dentries can significantly impact performance.
> > > > > > >
> > > > > > > Performance of what? I have to say I am kind of lost here. Are we
> > > > > > > talking about memory or disk storage?
> > > > > >
> > > > > > Removing an empty directory with numerous dentries can significantly
> > > > > > prolong the freeing of the associated dentries, leading to high
> > > > > > system CPU usage that adversely affects overall system performance.
> > > > >
> > > > > Is there anything that prevents you from reclaiming the memcg you are
> > > > > about to remove? We do have interfaces for that.
> > > >
> > > > Reclaiming numerous dentries through force_empty can also lead to
> > > > problems, which is why we attempt to shrink the slab gradually to
> > > > mitigate them. However, it's important to note that the underlying
> > > > causes of the issues in force_empty and rmdir are not identical, as
> > > > they involve different locks.
> > > >
> > > > > > > > To mitigate this issue, we aim to proactively reclaim these
> > > > > > > > dentries using a user agent. Extending the memory.reclaim
> > > > > > > > functionality to specifically target slabs aligns with our
> > > > > > > > requirements.
> > > > > > >
> > > > > > > Matthew has already pointed out that this has been proposed
> > > > > > > several times and rejected.
> > > > > >
> > > > > > With that being said, we haven't come up with any solutions superior
> > > > > > to the proposals already mentioned.
> > > > > >
> > > > > > > A dedicated slab shrinking interface is especially tricky because
> > > > > > > you would need a way to tell which shrinkers to invoke, and that
> > > > > > > would be very kernel version specific.
> > > > > >
> > > > > > The persistence of this issue over several years without any
> > > > > > discernible progress suggests that we might be heading in the wrong
> > > > > > direction. Perhaps we could consider providing a kernel interface to
> > > > > > users, allowing them to tailor the reclamation process to their
> > > > > > workload requirements.
> > > > >
> > > > > There are clear problems identified with type-specific reclaim, and
> > > > > those might easily strike back with future changes. Once we put an
> > > > > interface in place we won't be able to remove it, and that could lead
> > > > > to problems with future changes in memory reclaim.
> > > >
> > > > That shouldn't deter us from actively seeking a resolution to an issue
> > > > that has persisted for many years. As observed, numerous memcg
> > > > interfaces have been deprecated in recent years.
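
(To make the "interfaces" part of the discussion above concrete: what our
agent currently drives is the existing per-memcg reclaim files. A rough
sketch of that flow, assuming cgroup v2 is mounted at /sys/fs/cgroup and
"images/foo" stands in for our per-image cgroup:

    # cgroup v2: proactively reclaim up to 1G of memory from the memcg
    echo "1G" > /sys/fs/cgroup/images/foo/memory.reclaim

    # cgroup v1 equivalent, deprecated: reclaim as much as possible
    echo 0 > /sys/fs/cgroup/memory/images/foo/memory.force_empty

Both go through the regular reclaim path; neither provides a way to
target only the dentry slab, which is the gap being discussed here.)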
> > >
> > > There has been recent work to add a swappiness= argument to
> > > memory.reclaim to balance between anon and file pages. Adding a type=
> > > argument together with that is a recipe for eternal confusion. *If* we
> > > want to support this, we need to have a way to combine these two into
> > > something more user-friendly.
> >
> > What if we introduce a new file, like memory.shrink? This could serve
> > as a foundation for potential future extensions, allowing us to shrink
> > specific slabs by specific counts.
>
> Shrinking specific slabs is something that shouldn't be exposed as an
> interface, as this is a kernel implementation detail.

If that's the case, why was slab info exposed through /proc/slabinfo in
the first place? Isn't that level of detail a kernel implementation
detail as well? Currently, users can identify which slab is consuming
the most memory, but they lack the ability to take action based on that
information. This suggests a flaw in the kernel implementation.

> Also, memory.reclaim and memory.shrink would still have overlapping
> functionalities.
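
To make the idea more concrete, the sketch below is roughly what we have
in mind. Note this is purely illustrative: memory.shrink does not exist
today, and the file name, argument format, and cgroup path are all
hypothetical:

    # hypothetical interface: ask only the dentry shrinker to drop up
    # to 10000 objects charged to this memcg
    echo "dentry 10000" > /sys/fs/cgroup/images/foo/memory.shrink

Unlike memory.reclaim, which takes a byte count and drives the whole
reclaim path, this would let our agent target the one cache it already
knows is the problem.

--
Regards
Yafang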