On Thu 23-04-20 14:35:22, Yafang Shao wrote:
> On Thu, Apr 23, 2020 at 1:35 PM Tetsuo Handa [...]
> > dump_tasks() remains definitely a printk() abuser which is capable of pushing
> > many thousands of printk() messages in one second if async printk were available.
> > Async printk CANNOT deal with the problem that too much backlog causes important
> > messages to be delayed for too long. Please read my explanation carefully.
> >
>
> Agreed. Too much oom reports still be a issue even if the printk() is asyn.

I believe nobody is disputing this part. We are talking about two
things here, and I believe that contributes considerably to the
confusion:
1) dump_tasks being a large noise generator for the kernel log buffer
2) a heavy printk load from the oom context

There is no good answer for 1). We simply print a lot of data that
scales with the number of eligible tasks, and that might be thousands.
We have done quite a lot of work to make the data collecting part of
the process as efficient as possible, but having this feature enabled
by default is simply baggage we have to carry with us. printk currently
doesn't cope well with such a load. There might be some future changes,
but the bottom line is that no matter how printk gets optimized there is
still the payload to be printed, no matter whether this happens
transparently async or is explicitly done in a detached context.

2) is about the sync nature of printk _right_now_, which causes delays
in the allocator context while the system is OOM. There are potentially
locks held both by the OOM context and in the call chain to the
allocator. The longer the oom context takes, the longer the agony
lasts. This is where async printing might help, because it would push
the heavy lifting out to a different context.

There is clear agreement on this part. The whole discussion in this
thread is about how to achieve that. There are two ways.
Either develop code to do that for this very specific case (aka push
the work out to a worker) or rely on printk doing that for us and
potentially for many other places in a similar situation. I am
definitely for the latter option because a) it adds less code we have
to maintain and b) it is a more generic solution.

For the current or older kernels there are two ways to work around the
problem, and floods of oom killer events don't seem to be a regular
production system state (I would even dare to claim that something is
terribly wrong if they are), so no quick&dirty hacks are warranted.
Either tune the log level or simply disable dump_tasks. It is a useful
tool in some cases but not really necessary in the vast majority of
cases.

> I think the aysnc printk() won't care about wheter the data is
> improtant or not, so the user of printk() (even if it is asyn) should
> have a good management of these data especially if these data may
> consume all or most of the printk buffer.

Not sure what you mean here. We do have an option to tune the ring
buffer (both size and log levels) and dump_tasks specifically.
-- 
Michal Hocko
SUSE Labs
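
For reference, the two workarounds mentioned above (disable dump_tasks,
or tune the log level) can be sketched roughly as follows. This is only
an illustration, not a recommended tool: the helper names and the `root`
parameter are hypothetical, and it assumes that the per-task dump is
emitted at KERN_INFO (level 6) and that the knobs live at
/proc/sys/vm/oom_dump_tasks and /proc/sys/kernel/printk.

```python
from pathlib import Path

KERN_INFO = 6  # assumption: log level used for the dump_tasks() output


def disable_dump_tasks(root="/proc/sys"):
    """Turn off the per-task dump in OOM reports (vm.oom_dump_tasks=0)."""
    Path(root, "vm", "oom_dump_tasks").write_text("0\n")


def limit_console_loglevel(level=KERN_INFO, root="/proc/sys"):
    """Lower console_loglevel, the first of the four kernel.printk fields.

    Only messages whose level is numerically below console_loglevel are
    sent to the console, so level=6 keeps KERN_INFO lines off it.
    The messages are still stored in the ring buffer.
    """
    path = Path(root, "kernel", "printk")
    fields = path.read_text().split()
    # Never make the console more verbose than it already is.
    fields[0] = str(min(int(fields[0]), level))
    path.write_text("\t".join(fields) + "\n")
```

Calling `disable_dump_tasks()` and `limit_console_loglevel()` with the
default `root` needs root privileges and does not persist across
reboots. The `root` parameter exists only so that the helpers can be
tried against a scratch directory first. Note that lowering the console
log level only addresses the console side of 2); the payload from 1) is
still written to the log buffer.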