On (04/23/18 14:45), Petr Mladek wrote:
[..]
> I am not sure how slow are the slowest consoles. If I take that
> everything should be faster than 1200 bauds. Then 10 minutes
> should be enough for 1000 lines and 80 characters per-line:

Well, the problem with the numbers is that they are too... simple...
let me put it this way. What if I don't have a slow serial console? Or
what if I have NMI watchdog set to 40 seconds? Or what if I don't have
NMIs at all? Why am I all of a sudden limited by "1200 bauds"?

Another problem is that we limit the *wrong* thing. Not only because we
can [and probably need to] rate-limit the misbehaving code that calls
printk()-s, instead of printk(). But because we claim that we limit the
"number of lines" added recursively. This is wrong. We limit the number
of times vprintk_func() was called, which is != the number of added
lines. Because vprintk_func() is also called for pr_cont() or
printk(KERN_CONT) or printk("missing new line"). Backtraces contain
tons and tons of pr_cont()-s - registers print out, list of modules
print out, stack print out, code print out. Even this thing at the
bottom of a trace:

 Code: 01 ca 49 89 d1 48 89 d1 48 c1 ea 23 48 8b 14 d5 80 23 63 82 49 c1 e9 0c 48 c1 e9 1b 48 85 d2 74 0a 0f b6 c9 48 c1 e1 04 48 01 ca <48> 8b 12 49 c1 e1 06 b9 00 00 00 80 89 7d 80 89 75 84 48 8b 3d

is nothing but a bunch of pr_cont()-s, each of which will individually
end up in vprintk_func(). Error reports in general can contain even
more pr_cont() calls. E.g. core kernel code can hex dump slab memory,
while being called from one of console drivers.

Another problem is that nothing tells us that we *actually* have an
infinite loop. Nothing tells us that every call_console_drivers() adds
more messages to the logbuf. We see just one thing - the current
call_console_drivers() is about to add some lines to the logbuf later
on.

OK, why is this a problem? This can be a one time thing.
Or console_unlock() may be in a schedulable context, getting
rescheduled after every line it prints [either implicitly after
printk_safe_exit_irqrestore(), or explicitly by calling into the
scheduler - cond_resched()].

Most likely, we don't even realize how many things we are about to
break.

> Alternatively, it seems that we are going to call console drivers
> outside printk_safe context => the messages will appear in the main
> log buffer immediately => only small risk of a ping-pong with printk
> safe buffers. We might reset the counter when all messages are handled
> in console_unlock(). It will be more complex patch than when using
> ratelimiting but it still should be sane.

We may have some sort of vprintk_func()-based solution, may be. But we
first need a real reason. Right now it looks to me like we have "a
solution" to a problem which we have never witnessed.

That vprintk_func()-based solution, if there will be no other options
on the table, must be much smarter than anything that we have seen so
far. Sorry.

	-ss
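
To make the "calls != lines" point above concrete, here is a minimal
user-space sketch. It is not kernel code: struct tally, tally_fragment()
and tally_code_dump() are hypothetical names made up for illustration,
and the "one fragment per byte" shape only roughly mirrors how a
"Code:" line is assembled from pr_cont() calls. The assumption is that
every printk()/pr_cont() call counts once, no matter whether it
completes a line.

```c
#include <stddef.h>

/* Hypothetical tally; not a kernel structure. */
struct tally {
	size_t calls;	/* how many times the print path was entered */
	size_t lines;	/* how many newline-terminated lines were produced */
};

/*
 * Stand-in for a single printk()/pr_cont() call: every fragment bumps
 * 'calls', but only newline characters bump 'lines'.
 */
static void tally_fragment(struct tally *t, const char *frag)
{
	const char *p;

	t->calls++;
	for (p = frag; *p; p++)
		if (*p == '\n')
			t->lines++;
}

/*
 * Model a "Code:" dump of 'len' bytes: one opening fragment, one
 * continuation fragment per byte, one closing newline fragment.
 */
static struct tally tally_code_dump(size_t len)
{
	struct tally t = { 0, 0 };
	size_t i;

	tally_fragment(&t, "Code:");		/* opening printk() */
	for (i = 0; i < len; i++)
		tally_fragment(&t, " xx");	/* one pr_cont() per byte */
	tally_fragment(&t, "\n");		/* closing pr_cont("\n") */

	return t;
}
```

Under these assumptions tally_code_dump(64) reports 66 calls and 1
line, so a limit expressed in "calls" would be used up about 66 times
faster than a limit expressed in "lines" for this one piece of a
backtrace.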