On Wed 19-06-24 10:30:46, Michal Hocko wrote:
> On Wed 19-06-24 01:03:16, Shakeel Butt wrote:
> > On Wed, Jun 19, 2024 at 09:19:41AM GMT, Michal Hocko wrote:
> > > On Tue 18-06-24 14:34:21, Shakeel Butt wrote:
> > > > At the moment oversize kvmalloc warnings are triggered once using
> > > > WARN_ON_ONCE() macro. One issue with this approach is that it only
> > > > detects the first abuser and then ignores the remaining abusers which
> > > > complicates detecting all such abusers in a timely manner. The situation
> > > > becomes worse when the repro has low probability and requires production
> > > > traffic and thus requires a large set of machines to find such abusers. In
> > > > Meta production, this warn once is slowing down the detection of these
> > > > abusers. Simply replace WARN_ON_ONCE with WARN_RATELIMIT.
> > >
> > > Long time ago, I've had a patch to do the once_per_callsite WARN. I
> > > cannot find reference at the moment but it used stack depot to note
> > > stacks that have already triggered. Back then there was no response on
> > > the ML. Should I try to dig deep and recover it from my archives? I
> > > think this is exactly the kind of use case where it would fit.
> > >
> >
> > Do you mean something like warn once per unique call stack?
>
> Exactly!
>
> > If yes then
> > I think that is better than the simple ratelimiting version as
> > ratelimiting one may still miss some abusers and also may keep warning
> > about the same abuser. Please do share your patch.
>
> https://lore.kernel.org/all/20170103134424.28123-1-mhocko@xxxxxxxxxx/

Btw. the code has changed a lot since 2017 when this was posted so it
will likely need a lot of massaging to rebase. Also I am not entirely
sure it is ok to change WARN_ONCE semantics like that anymore. Maybe we
need an explicit variant that does these per-call-site warnings. There is
a notable difference between library functions, which can be called from
different call paths, and those that are used from a single place.

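To make the "warn once per unique call stack" idea concrete, here is a
rough userspace sketch. It is neither the 2017 patch nor kernel code: it
uses glibc's backtrace() where a kernel version would presumably capture
the stack and record it in stack depot. Every name in it
(stack_seen_first_time, big_alloc_ok, path_a, path_b, OVERSIZE_LIMIT, the
256-slot table) is made up for the example. It also treats a 64-bit hash
of the return addresses as the identity of a stack, which is a
simplification.

```c
#include <assert.h>
#include <execinfo.h>	/* backtrace(): glibc extension, assumed available */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_FRAMES	16
#define SEEN_SLOTS	256	/* fixed table; once full, new stacks go unreported */
#define OVERSIZE_LIMIT	((size_t)INT32_MAX)	/* stand-in for the kvmalloc limit */

static uint64_t seen_stacks[SEEN_SLOTS];	/* 0 means "empty slot" */
int warnings_emitted;

/* FNV-1a hash over the return addresses of a captured stack. */
static uint64_t hash_stack(void *const *frames, int nr)
{
	uint64_t h = 0xcbf29ce484222325ULL;

	for (int i = 0; i < nr; i++) {
		uintptr_t addr = (uintptr_t)frames[i];

		for (size_t b = 0; b < sizeof(addr); b++) {
			h ^= (addr >> (8 * b)) & 0xff;
			h *= 0x100000001b3ULL;
		}
	}
	return h ? h : 1;	/* keep 0 reserved for empty slots */
}

/* Return true only the first time the current call stack is seen. */
static __attribute__((noinline)) bool stack_seen_first_time(void)
{
	void *frames[MAX_FRAMES];
	int nr = backtrace(frames, MAX_FRAMES);
	uint64_t h;

	assert(nr > 0);
	h = hash_stack(frames, nr);
	for (size_t i = 0; i < SEEN_SLOTS; i++) {	/* linear probing */
		size_t slot = (h + i) % SEEN_SLOTS;

		if (seen_stacks[slot] == h)
			return false;		/* already reported */
		if (seen_stacks[slot] == 0) {
			seen_stacks[slot] = h;	/* remember it, then report */
			return true;
		}
	}
	return false;				/* table full */
}

/* Hypothetical library helper reachable from several call paths. */
__attribute__((noinline)) bool big_alloc_ok(size_t size)
{
	if (size <= OVERSIZE_LIMIT)
		return true;
	if (stack_seen_first_time()) {
		warnings_emitted++;
		fprintf(stderr, "oversize request: %zu bytes\n", size);
	}
	return false;
}

/* Two distinct call paths into the same helper; both return true on rejection. */
__attribute__((noinline)) bool path_a(void) { return !big_alloc_ok(OVERSIZE_LIMIT + 1); }
__attribute__((noinline)) bool path_b(void) { return !big_alloc_ok(OVERSIZE_LIMIT + 2); }
```

Calling path_a() and path_b() three times each, each from its own loop,
is expected to leave warnings_emitted at 2: one report per distinct
stack, no matter how often a path repeats. A single WARN_ON_ONCE in
big_alloc_ok() would report only whichever path hits first, and a
ratelimited warning may keep reporting one path while missing the other.
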
I do not have much time to dig deeper into this but if you want to take
over then go ahead. I still think this is a useful WARN_ONCE or in
general do_something_once semantic.
-- 
Michal Hocko
SUSE Labs