On (03/07/19 15:21), John Ogness wrote:
> > John, sorry to ask this, does new printk() design always provide
> > latency guarantees good enough for PREEMPT_RT?
>
> Yes, because it is assumed that emergency messages will never occur for
> a correctly running system.
>
[..]
> Obviously as soon as any emergency message appears, an _unacceptable_
> latency occurs. But that is considered OK because the system is no
> longer running correctly and it is worth the price to pay to get those
> messages with such high reliability.

OK, so what *I'm learning* from this bug report:

10) WARN/ERR messages do not necessarily tell us that the stability of
    the system was jeopardized. The system can "run correctly" and be
    "perfectly healthy".

20) We can have N CPUs reporting issues simultaneously. Even in
    production. Such patterns exist in the kernel.

30) The "reporting part" - printk()->call_console_drivers() - can be
    the slowest one.

    In this particular case, given that Mikulas saw dropped messages,
    checksum calculation was significantly faster than
    call_console_drivers().

    Now, suppose we have new printk, and suppose we have CPUs A B C D,
    each of them reports a checksum error:

	A prb_lock owner
	B prb_lock
	C prb_lock
	D prb_lock

	A calls call_console_drivers, unlocks prb_lock
	B grabs prb_lock
	B calls call_console_drivers
	A calculates new checksum mismatch
	A calls printk and spins on prb_lock, behind D

    So now we have:

	B prb_lock owner
	C prb_lock
	D prb_lock
	A prb_lock

    And so on

	B C D A -> C D A B -> D A B C -> A B C D -> ...

    After M rounds of error reporting (M > N), each CPU has had to
    busy wait roughly M * (N - 1) times. Sounds quadratic.

40) goto 10

So I have some doubts regarding some of the assumptions behind the new
printk design. And the problem is not in prb_lock() unfairness.

Current printk design does look SMP-friendly to me; yes, it has an
unbounded printing loop; that can be addressed. But it doesn't turn an
SMP system into a UP one.
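
To put rough numbers on the rotation in 30), here is a toy model, not
kernel code. It assumes a strictly FIFO prb_lock, one console print slot
per message, and a checksum step that costs nothing compared to
call_console_drivers(). The function name and the "print slot" unit are
made up for illustration.

```python
from collections import deque


def simulate_prb_lock(n_cpus, rounds):
    """Toy model: count print slots each CPU spends spinning on prb_lock.

    Assumptions (illustrative, not taken from the printk patches):
    - the lock is handed over in strict FIFO order;
    - each message occupies the console for exactly one slot;
    - a CPU finds its next checksum mismatch immediately and re-queues.
    """
    queue = deque(range(n_cpus))   # CPUs waiting for the lock, head = owner
    waited = [0] * n_cpus          # slots spent busy waiting, per CPU
    printed = [0] * n_cpus         # messages printed so far, per CPU

    while queue:
        owner = queue.popleft()
        # While the owner is in call_console_drivers(), everybody else
        # still in the queue spins for one slot.
        for cpu in queue:
            waited[cpu] += 1
        printed[owner] += 1
        if printed[owner] < rounds:
            queue.append(owner)    # spins again, behind the others

    return waited


if __name__ == "__main__":
    n, m = 4, 3
    print(simulate_prb_lock(n, m))
```

With N = 4 and M = 3 the model gives [6, 7, 8, 9] slots for A, B, C, D.
Under these assumptions every CPU lands between (M - 1) * (N - 1) and
M * (N - 1), so the system as a whole spends on the order of
M * N * (N - 1) slots spinning.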
	-ss

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel