On 2020/10/13 18:02, Petr Mladek wrote:
> On Tue 2020-10-13 09:40:27, Tetsuo Handa wrote:
>> On 2020/10/13 0:41, Michal Hocko wrote:
>>>> What about introducing some feedback from the printk code?
>>>>
>>>>     static u64 printk_last_report_seq;
>>>>
>>>>     if (consoles_seen(printk_last_report_seq)) {
>>>>         dump_header();
>>>>         printk_last_report_seq = printk_get_last_seq();
>>>>     }
>>>>
>>>> By other words. It would skip the massive report when the consoles
>>>> were not able to see the previous one.
>>>
>>> I am pretty sure this has been discussed in the past but maybe we really
>>> want to make ratelimit to work reasonably also for larger sections
>>> instead. Current implementation only really works if the rate limited
>>> operation is negligible wrt to the interval. Can we have a ratelimit
>>> alternative with a scope effect (effectivelly lock like semantic)?
>>>     if (rate_limit_begin(&oom_rs)) {
>>>         dump_header();
>>>         rate_limit_end(&oom_rs);
>>>     }
>>>
>>> rate_limi_begin would act like a try lock with additional constrain on
>>> the period/cadence based on rate_limi_end marked values.
>>>
>>
>> Here is one of past discussions.
>>
>> https://lkml.kernel.org/r/7de2310d-afbd-e616-e83a-d75103b986c6@xxxxxxxxxxxxxxxxxxx
>> https://lkml.kernel.org/r/20190830103504.GA28313@xxxxxxxxxxxxxx
>> https://lkml.kernel.org/r/57be50b2-a97a-e559-e4bd-10d923895f83@xxxxxxxxxxxxxxxxxxx
>>
>> Michal Hocko complained about different OOM domains, and now just ignores it...
>
> How is this related to this discussion, please? AFAIK, we are
> discussing how to tune the values of the existing ratelimiting.

dump_tasks() is one of the functions called from dump_header(). Since Michal
wants to recognize OOM domains when ratelimiting dump_tasks(), the ratelimit
for dump_header() is also expected to recognize OOM domains.

>
>> Proper ratelimiting for OOM messages had better not to count on asynchronous printk().
>
> I am a bit confused. AFAIK, you wanted to print OOM messages
> asynchronous ways in the past.
> The lockless printk ringbuffer is on
> its way into 5.10. Handling consoles in kthreads will be the next
> step of the printk rework.

What I'm proposing is to print OOM messages synchronously from a different
thread, because a single dump_tasks() call can generate thousands of lines,
which may significantly delay the arrival of non-OOM messages at the consoles
(or even cause them to be dropped because logbuf is full). I don't want to
enqueue too many OOM-related messages to logbuf, even after printk() has
become completely asynchronous.

>
> OK, the current state is that printk() is semi-synchronous. It does
> console_trylock(). The console is handled immediately when it
> succeeds. Otherwise it expects that the current console_lock owner
> would do the job.
>
> Tuning ratelimits is not trivial for a particular system. It would
> be better to have some autotuning. If the printk is synchronous,
> we could measure how long the printing took. If it is asynchronous,
> we could check whether the last report has been already flushed or
> not. We could then decide whether to print the new report.

Checking whether the last report has already been flushed requires
recognizing OOM domains.

>
> What is the desired behavior, please?
>
> Could you please provide some examples how you would tune ratelimit
> when printing all messages to the console takes X ms and OOM
> happens every Y ms?

My proposal is to decide whether to print a new report based on whether all
OOM candidates for that OOM domain have been flushed to the consoles. There
is no X and Y.
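
To illustrate the idea only, here is a minimal userspace sketch. It is not a
patch and none of these names exist in the kernel: struct oom_domain,
oom_report() and console_flush() are hypothetical, and the two sequence
counters merely stand in for whatever feedback printk could provide (in the
spirit of the consoles_seen() suggestion quoted above). The point is that the
decision is kept per OOM domain and depends only on whether that domain's
previous report has reached the consoles, not on a time interval.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for printk state (not real kernel symbols). */
static uint64_t log_next_seq;	/* lines enqueued to the log buffer so far */
static uint64_t console_seq;	/* lines already flushed to the consoles */

/* Hypothetical per-domain state, e.g. one instance per memcg or cpuset. */
struct oom_domain {
	const char *name;
	bool reported;			/* has this domain ever reported? */
	uint64_t last_report_end_seq;	/* log_next_seq after its last report */
};

/* Pretend the console driver has emitted up to @nr_lines more lines. */
static void console_flush(uint64_t nr_lines)
{
	console_seq += nr_lines;
	if (console_seq > log_next_seq)
		console_seq = log_next_seq;
	assert(console_seq <= log_next_seq);
}

/*
 * Enqueue a report of @nr_lines for @d unless the previous report of the
 * same domain is still waiting to be flushed. Returns true if printed.
 */
static bool oom_report(struct oom_domain *d, uint64_t nr_lines)
{
	if (d->reported && console_seq < d->last_report_end_seq)
		return false;	/* previous report not yet on the consoles */

	log_next_seq += nr_lines;	/* stands in for dump_header() output */
	d->last_report_end_seq = log_next_seq;
	d->reported = true;
	return true;
}
```

With this shape, a second OOM in the same domain is skipped while its earlier
report is still pending, whereas an OOM in a different domain is still
reported once.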