On Tue, 23 Jan 2018 07:43:47 -0800
Tejun Heo <tj@xxxxxxxxxx> wrote:

> So, at least in the case that we were seeing, it isn't that black and
> white.  printk keeps causing printks but only because printk buffer
> flushing is preventing the printk'ing context from making forward
> progress.  The key problem there is that a flushing context may get
> pinned flushing indefinitely and using a separate context does solve
> the problem.
>

Does it?

From what I understand, there's an issue with one of the printk
consoles, due to memory pressure or whatnot. Then a printk happens
within a printk recursively. It gets put into the safe buffer and an
irq is sent to printk this printk.

The issue, as you describe it, is that when the printk enables
interrupts, the irq work triggers and loads the log buffer with the
safe buffer, and then the printk sees the new data added and continues
to print, and hence never leaves this printk.

Your solution is to delay the flushing of the safe buffer to another
thread (work queue), which I also have issues with, because you break
the "get printks out ASAP" mantra. Then the work queue comes in and
flushes the printks. And since the printks cause printks, we continue
to spam the machine, but hey, we are making forward progress.

Again, this is treating the symptom and not solving the problem.

I really hate delaying printks to another thread, unless we can
guarantee that that thread is ready to go immediately (basically
spinning on a run queue waiting to print). Because if the system is
having issues (which is the main reason for printks to happen), there's
no guarantee that a work queue or another thread will ever schedule,
and the safe printk buffer never gets out to the consoles.

I'd much rather have throttling when recursive printks are detected.
Make it 100 lines to print if you want, but then throttle. Because once
you have 100 lines or so, you will know that printks are causing
printks, and you don't give a crap about the repeated process.
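
To make the throttling idea concrete, here is a minimal userspace
sketch. It is not kernel code and not a patch: the struct, the function
names and the RECURSIVE_PRINT_LIMIT constant are all hypothetical, and
the limit of 100 is just the number floated above. It only shows the
counting logic: let the first N recursive lines through, drop the rest,
and report how many were dropped once the recursion ends.

```c
#include <stdbool.h>

/* Hypothetical limit: the "100 lines or so" suggested above. */
#define RECURSIVE_PRINT_LIMIT 100

/* Hypothetical per-console (or global) throttle state. */
struct recursion_throttle {
	unsigned int printed;    /* recursive lines let through so far */
	unsigned int suppressed; /* recursive lines dropped after the limit */
};

/*
 * Decide whether one more recursive message may be printed.
 * Returns true while under the limit; afterwards it only counts drops.
 */
static bool throttle_allow(struct recursion_throttle *t)
{
	if (t->printed < RECURSIVE_PRINT_LIMIT) {
		t->printed++;
		return true;
	}
	t->suppressed++;
	return false;
}

/*
 * Re-arm the throttle once the printing context has left the recursion.
 * Returns the number of dropped messages so the caller can log a single
 * "N recursive messages suppressed" line instead of the repeats.
 */
static unsigned int throttle_reset(struct recursion_throttle *t)
{
	unsigned int dropped = t->suppressed;

	t->printed = 0;
	t->suppressed = 0;
	return dropped;
}
```

With a storm of 250 recursive messages, the first 100 get out
immediately and the remaining 150 are only counted, so nothing depends
on another thread getting scheduled.
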
Allow one flushing of the printk safe buffers, and then if it happens
again, throttle it.

Both methods can lose important data. I believe the throttling of
recursive printks, after 100 prints or whatever, will be the least
likely to lose important data, because printks caused by printks will
just keep repeating the same data, and we don't care about repeats.
But delaying the flushing could very well lose important data that
caused a lockup.

-- Steve