On Thu, Apr 23, 2020 at 9:28 PM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On 2020/04/23 20:07, Michal Hocko wrote:
> > The existing loglevels we use are not really carved in stone and we can
> > prioritize some more than others. dump_tasks is KERN_INFO already and
> > this is quite a low priority so you shouldn't really miss much when
> > omitting it. But I wouldn't mind making it KERN_DEBUG.
>
> Do you mean
>
> -       pr_info("[%7d] %5d %5d %8lu %8lu %8ld %8lu %5hd %s\n",
> +       pr_debug("[%7d] %5d %5d %8lu %8lu %8ld %8lu %5hd %s\n",
>
> in order to suppress printing to consoles? Wow! That will also suppress
> saving to log files because syslog daemon is likely configured not to save
> KERN_DEBUG level messages.
>
> >> - tune the dump_tasks specifically (vm.oom_dump_tasks)
> >>   All the consumers are affected.
> >>   The logfile is fast enough, so we expect that these dump_tasks could
> >>   be printed into the logfile.
> >>   The console is so slow that we don't want to print into it.
> >>   A possible way to fix it is to improve vm.oom_dump_tasks.
> >>   vm.oom_dump_tasks : 1 - dump into all consumers
> >>                       2 - don't dump into console
> >>                       0 - don't dump into any of
> >
> > How would that be implemented. I do not know of a way to tell printk
> > which consoles to use for the output. Anyway, isn't this something
> > that can be configured on the printk level. In other words send only
> > important information to slow consoles?
>
> Last year I proposed
> https://lkml.kernel.org/r/1550896930-12324-1-git-send-email-penguin-kernel@xxxxxxxxxxxxxxxxxxx
> and Sergey Senozhatsky commented
>
>   "This is a bit of a strange issue, to be honest. If OOM prints too
>   many messages then we might want to do some work on the OOM side."
>
> . I was thinking that printing to consoles is a requirement for oom_dump_tasks.
>
> If we can agree with not printing dump_tasks() output to consoles, a trivial
> patch shown below will solve the problem. Those who cannot run syslog daemon
> in userspace might disagree, but this will be the simplest answer.
>
>  include/linux/kern_levels.h | 3 +++
>  include/linux/printk.h      | 1 +
>  kernel/printk/printk.c      | 7 ++++++-
>  mm/oom_kill.c               | 7 ++++---
>  4 files changed, 14 insertions(+), 4 deletions(-)
>
> diff --git a/include/linux/kern_levels.h b/include/linux/kern_levels.h
> index bf2389c26ae3..cd69a9cb3c2a 100644
> --- a/include/linux/kern_levels.h
> +++ b/include/linux/kern_levels.h
> @@ -23,6 +23,9 @@
>   */
>  #define KERN_CONT       KERN_SOH "c"
>
> +/* Annotation for "don't print to consoles". */
> +#define KERN_NO_CONSOLES KERN_SOH "S"
> +
>  /* integer equivalents of KERN_<LEVEL> */
>  #define LOGLEVEL_SCHED          -2      /* Deferred messages from sched code
>                                          * are set to this special level */
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index e061635e0409..da338b81c2e1 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -19,6 +19,7 @@ static inline int printk_get_level(const char *buffer)
>                 switch (buffer[1]) {
>                 case '0' ... '7':
>                 case 'c':       /* KERN_CONT */
> +               case 'S':       /* KERN_NO_CONSOLES */
>                         return buffer[1];
>                 }
>         }
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 9a9b6156270b..ed51641af087 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -361,6 +361,7 @@ static int console_msg_format = MSG_FORMAT_DEFAULT;
>   */
>
>  enum log_flags {
> +       LOG_NO_CONSOLES = 1,    /* don't print to consoles */
>         LOG_NEWLINE     = 2,    /* text ended with a newline */
>         LOG_CONT        = 8,    /* text is a fragment of a continuation line */
>  };
> @@ -1959,6 +1960,9 @@ int vprintk_store(int facility, int level,
>                                 break;
>                         case 'c':       /* KERN_CONT */
>                                 lflags |= LOG_CONT;
> +                               break;
> +                       case 'S':       /* KERN_NO_CONSOLES */
> +                               lflags |= LOG_NO_CONSOLES;
>                         }
>
>                         text_len -= 2;
> @@ -2453,7 +2457,8 @@ void console_unlock(void)
>                         break;
>
>                 msg = log_from_idx(console_idx);
> -               if (suppress_message_printing(msg->level)) {
> +               if ((msg->flags & LOG_NO_CONSOLES) ||
> +                   suppress_message_printing(msg->level)) {
>                         /*
>                          * Skip record we have buffered and already printed
>                          * directly to the console when we received it, and
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index dfc357614e56..0b487c13a2c9 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -400,7 +400,7 @@ static int dump_task(struct task_struct *p, void *arg)
>                 return 0;
>         }
>
> -       pr_info("[%7d] %5d %5d %8lu %8lu %8ld %8lu %5hd %s\n",
> +       pr_info(KERN_NO_CONSOLES "[%7d] %5d %5d %8lu %8lu %8ld %8lu %5hd %s\n",
>                 task->pid, from_kuid(&init_user_ns, task_uid(task)),
>                 task->tgid, task->mm->total_vm, get_mm_rss(task->mm),
>                 mm_pgtables_bytes(task->mm),
> @@ -423,8 +423,9 @@ static int dump_task(struct task_struct *p, void *arg)
>   */
>  static void dump_tasks(struct oom_control *oc)
>  {
> -       pr_info("Tasks state (memory values in pages):\n");
> -       pr_info("[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
> +       pr_info_once("Tasks state is sent to syslog.\n");
> +       pr_info(KERN_NO_CONSOLES "Tasks state (memory values in pages):\n");
> +       pr_info(KERN_NO_CONSOLES "[ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name\n");
>
>         if (is_memcg_oom(oc))
>                 mem_cgroup_scan_tasks(oc->memcg, dump_task, oc);
>

I suggest setting KERN_NO_CONSOLES by default, while letting the user tune
it back to the original behavior.
I'm not a fan of sysctl, but if there's no better choice, enhancing
vm.oom_dump_tasks seems a possible solution.

--
Thanks
Yafang
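
P.S. To make the vm.oom_dump_tasks idea concrete, the three values proposed
earlier in the thread could be modelled roughly as below. This is only a
userspace sketch and not kernel code: the enum, the OOM_DUMP_DEFAULT macro and
both helper functions are hypothetical names that exist in no tree, and how the
"no console" case would actually reach printk (for example via the
KERN_NO_CONSOLES annotation from the patch above) is left open.

```c
#include <stdbool.h>

/*
 * Hypothetical values for an extended vm.oom_dump_tasks, following the
 * 0/1/2 proposal quoted in this thread. None of these names exist in
 * the kernel; they are assumptions made for illustration only.
 */
enum oom_dump_mode {
	OOM_DUMP_NONE       = 0,	/* don't dump into any consumer */
	OOM_DUMP_ALL        = 1,	/* dump into all consumers (original behavior) */
	OOM_DUMP_NO_CONSOLE = 2,	/* dump into the log buffer, skip consoles */
};

/* Assumed default, matching the suggestion to keep consoles quiet by default. */
#define OOM_DUMP_DEFAULT OOM_DUMP_NO_CONSOLE

/* Would dump_tasks() output be stored in the log buffer (and so reach syslog)? */
static inline bool oom_dump_to_logbuf(int mode)
{
	return mode == OOM_DUMP_ALL || mode == OOM_DUMP_NO_CONSOLE;
}

/* Would dump_tasks() output also be pushed out to consoles? */
static inline bool oom_dump_to_console(int mode)
{
	return mode == OOM_DUMP_ALL;
}
```

With this model, writing 1 restores today's behavior, while any value other
than 1 or 2 is treated like 0 and suppresses the dump entirely.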