On Fri, Sep 27, 2024 at 4:28 PM zhang fangzheng <fangzheng.zhang1003@xxxxxxxxx> wrote:
>
> On Thu, Sep 26, 2024 at 8:30 PM Vlastimil Babka <vbabka@xxxxxxx> wrote:
> >
> > On 9/25/24 15:18, Hyeonggon Yoo wrote:
> > > On Wed, Sep 25, 2024 at 12:23 PM Fangzheng Zhang
> > > <fangzheng.zhang@xxxxxxxxxx> wrote:
> > >>
> > >> Hi all,
> > >
> > > Hi Fangzheng,
> > >
> > >> A method to detect slub leaks by monitoring its usage in real time
> > >> on the page allocation path of the slub. When the slub occupancy
> > >> exceeds the user-set value, it is considered that the slub is leaking
> > >> at this time
> > >
> > > I'm not sure why this should be a kernel feature. Why not write a user
> > > script that parses
> > > MemTotal: and Slab: part of /proc/meminfo file and generates a log
> > > entry or an alarm?
> >
> > Yes very much agreed. It seems rather arbitrary. Why slab, why not any other
> > kernel-specific counter in /proc/meminfo? Why include NR_SLAB_RECLAIMABLE_B
> > when that's used by caches with shrinkers?
>
> Ok, this is because the current consideration is to specifically
> track the memory usage of the slab module.
> In the stability test, ie, monkey test,
> the anr or reboot problem occurs, there is a high probability
> that the slab occupancy is high when it comes to memory analysis.
> In addition to directly monitoring leaks in the allocation path, it is
> also convenient to record the allocation stack information
> when an exception occurs.

[+Cc Memory Allocation Profiling maintainers]

For recording allocation information, I think CONFIG_MEM_ALLOC_PROFILING [1] [2]
may be used to track allocation sites that contribute to memory leaks,
instead of making the kernel panic or printing WARNING?

.....Or with higher overhead, slub_debug=U [3] if it is not meant to be
run on production.
[1] https://docs.kernel.org/mm/allocation-profiling.html
[2] https://lwn.net/Articles/974380
[3] https://docs.kernel.org/mm/slub.html#debugfs-files-for-slub

Best,
Hyeonggon

> > A userspace solution should be straightforward and universal - easily
> > configurable for different scenarios.
> >
> > >> and a panic operation will be triggered immediately.
> > >
> > > I don't think it would be a good idea to panic unnecessarily.
> > > IMO it is not proper to panic when the kernel can still run.
> >
> > Yes these days it's practically impossible to add a BUG_ON() for more
> > serious conditions than this.
> >
> > Please don't post new versions addressing specific implementation details
> > until this fundamental issue is addressed.
> >
> > Thanks,
> > Vlastimil
> >
> > > Any thoughts?
> > >
> > > Thanks,
> > > Hyeonggon
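
For illustration, the userspace approach suggested in the thread (parse the
MemTotal: and Slab: fields of /proc/meminfo and generate a log entry) could
look roughly like the sketch below. It is not code from this thread: the
function names and the 10% threshold are hypothetical, and it assumes the
usual "Name:   value kB" layout of /proc/meminfo.

```python
import logging


def parse_meminfo(text):
    """Map /proc/meminfo field names to their integer values (kB for most fields)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name:" prefix
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def slab_percent(fields):
    """Slab usage as a percentage of MemTotal."""
    return 100.0 * fields["Slab"] / fields["MemTotal"]


def check_slab(text, threshold_percent=10.0):
    """Log a warning and return True when Slab exceeds the threshold.

    threshold_percent is an arbitrary illustrative default, not a
    recommended value.
    """
    pct = slab_percent(parse_meminfo(text))
    if pct > threshold_percent:
        logging.warning(
            "Slab is %.1f%% of MemTotal (threshold %.1f%%)",
            pct, threshold_percent,
        )
        return True
    return False


if __name__ == "__main__":
    # Reads the live counters; only meaningful on Linux.
    with open("/proc/meminfo") as f:
        check_slab(f.read())
```

Such a script could be run periodically (cron, a systemd timer, or a loop in
the test harness). Keying on SUnreclaim instead of Slab would leave out the
reclaimable caches questioned above.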