I'll Cc blockdev

On (03/27/18 08:36), bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
>
> --- Comment #17 from sergey.senozhatsky.work@xxxxxxxxx ---
>
> On (03/26/18 13:05), bugzilla-daemon@xxxxxxxxxxxxxxxxxxx wrote:
> > > Therefore the serial console is actually pretty fast. It seems that the
> > > deadline 10ms-per-character is not in the game here.
> >
> > As the name suggests this is dmesg - content of logbuf. We can't tell
> > anything about serial consoles speed from it.
>
> Grrr, you are right. It would be interesting to see the output from
> the serial port as well.
>
> Anyway, it does not change the fact that printing so many same lines is
> useless. The throttling still would make sense and probably would
> solve the problem.

You are right.

Looking at backtraces
(https://bugzilla.kernel.org/attachment.cgi?id=274953&action=edit)
there *probably* was just one CPU doing all the printk-s and all the
printouts, and one CPU waiting for that printing CPU to unlock the
queue spin_lock.

The printing CPU was looping in scsi_request_fn(), picking up requests
and calling sdev_printk() for each of them, because the device was
offline. Given that the serial console is not very fast, that we called
into the serial console under the queue spin_lock, and the number of
printk-s issued, it was enough to lock up the CPU which was spinning on
the queue spin_lock and to hard lockup the system.

scsi_request_fn() does unlock the queue lock later, but not in that
!scsi_device_online(sdev) error case.

scsi_request_fn()
{
	for (;;) {
		int rtn;
		/*
		 * get next queueable request.  We do this early to make sure
		 * that the request is fully prepared even if we cannot
		 * accept it.
		 */
		req = blk_peek_request(q);
		if (!req)
			break;

		if (unlikely(!scsi_device_online(sdev))) {
			sdev_printk(KERN_ERR, sdev,
				    "rejecting I/O to offline device\n");
			scsi_kill_request(req, q);
			continue;
			^^^^^^^^^ still under spinlock
		}
	}
}

I'd probably just unlock/lock the queue lock before `continue', rather
than ratelimit the printk-s. Dunno.
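To make the unlock/lock idea concrete, here is a small userspace model of
the offline-device branch. It is only a sketch, not a patch: struct
model_queue, model_lock(), model_unlock() and model_reject_offline() are
made-up names, not kernel API, and the "lock" is a plain flag so that the
lock/unlock pattern can be checked without threads. In the real function
the corresponding calls would presumably be spin_unlock_irq() and
spin_lock_irq() on the queue lock, placed right before `continue'.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the request queue; not the kernel's struct. */
struct model_queue {
	int pending;	/* requests still queued */
	bool lock_held;	/* models the queue spin_lock */
	int lock_drops;	/* times the lock was released inside the loop */
};

void model_lock(struct model_queue *q)
{
	assert(!q->lock_held);
	q->lock_held = true;
}

void model_unlock(struct model_queue *q)
{
	assert(q->lock_held);
	q->lock_held = false;
	q->lock_drops++;
}

/*
 * Models the !scsi_device_online() branch. The caller holds the lock and
 * gets it back held. Returns the number of rejected requests.
 */
int model_reject_offline(struct model_queue *q)
{
	int rejected = 0;

	assert(q->lock_held);
	while (q->pending > 0) {
		/* slow console write, still under the lock as in the original */
		fprintf(stderr, "rejecting I/O to offline device\n");
		q->pending--;		/* models scsi_kill_request() */
		rejected++;

		/* the proposed change: open a window for a CPU spinning on the lock */
		model_unlock(q);
		model_lock(q);
	}
	return rejected;
}
```

The printout itself stays under the lock here, exactly as in the code
above; the only difference is that a waiter gets a chance to take the
lock between two rejected requests instead of spinning for the whole
loop.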
James, Martin, what do you think?

	-ss