On Wed 2015-12-09 15:50:07, Andrew Morton wrote:
> On Wed, 9 Dec 2015 14:21:02 +0100 Petr Mladek <pmladek@xxxxxxxx> wrote:
>
> > printk() takes some locks and cannot be used in a safe way in NMI
> > context.
> >
> > The chance of a deadlock is real, especially when printing
> > stacks from all CPUs. This particular problem has been addressed
> > on x86 by the commit a9edc8809328 ("x86/nmi: Perform a safe NMI stack
> > trace on all CPUs").
> >
> > This patch reuses most of the code and makes it generic. It is
> > useful for all messages and architectures that support NMI.
> >
> > The patch is heavily based on the draft from Peter Zijlstra,
> > see https://lkml.org/lkml/2015/6/10/327
>
> I guess this code is useful even on CONFIG_SMP=n: to avoid corruption
> of the printk internal structures which the problematic locking
> protects.

Yup, and it is used even on CONFIG_SMP=n if I am not missing something.
At least, CONFIG_PRINTK_NMI stays enabled here.

> > +#define NMI_LOG_BUF_LEN	(4096 - sizeof(atomic_t) - sizeof(struct irq_work))
> > +
> > +struct nmi_seq_buf {
> > +	atomic_t		len;	/* length of written data */
> > +	struct irq_work		work;	/* IRQ work that flushes the buffer */
> > +	unsigned char		buffer[NMI_LOG_BUF_LEN];
>
> When this buffer overflows, which characters get lost?  Most recent or
> least recent?

The most recent messages are lost when the buffer overflows. The other
way around would require using a ring buffer instead of the seq_buf.
We would need lockless synchronization of both the begin and end
pointers. It would add quite some complexity.

> I'm not sure which is best, really.  For an oops trace you probably
> want to preserve the least recent output: the stuff at the start of the
> output.

I agree. Fortunately, this is the easier variant and it already works
this way.

> > +static void __printk_nmi_flush(struct irq_work *work)
> > +{
> > +	static raw_spinlock_t read_lock =
> > +		__RAW_SPIN_LOCK_INITIALIZER(read_lock);
> > +	struct nmi_seq_buf *s = container_of(work, struct nmi_seq_buf, work);
> > +	int len, size, i, last_i;
> > +
> > +	/*
> > +	 * The lock has two functions. First, one reader has to flush all
> > +	 * available messages to make the lockless synchronization with
> > +	 * writers easier. Second, we do not want to mix messages from
> > +	 * different CPUs. This is especially important when printing
> > +	 * a backtrace.
> > +	 */
> > +	raw_spin_lock(&read_lock);
> > +
> > +	i = 0;
> > +more:
> > +	len = atomic_read(&s->len);
> > +
> > +	/*
> > +	 * This is just a paranoid check that nobody has manipulated
> > +	 * the buffer in an unexpected way. If we printed something then
> > +	 * @len must only increase.
> > +	 */
> > +	WARN_ON(i && i >= len);
>
> hm, dumping a big backtrace in this context seems a poor idea.  Oh
> well, shouldn't happen.

I see, and the backtrace probably would not help much because "len"
might be manipulated also from NMI context. I am going to change this
to:

	if (i && i >= len)
		pr_err("printk_nmi_flush: internal error: i=%d >= len=%d\n",
		       i, len);

> > +	if (!len)
> > +		goto out;	/* Someone else has already flushed the buffer. */
> > +
> > +	/* Make sure that data has been written up to the @len */
> > +	smp_rmb();
> > +
> > +	size = min_t(int, len, sizeof(s->buffer));
>
> len and size should have type size_t.

OK.

> > --- /dev/null
> > +++ b/kernel/printk/printk.h
>
> I find it a bit irritating to have duplicated filenames.  We could
> follow convention and call this "internal.h".

No problem. I am going to send an updated patchset soon.

Thanks a lot for the review,
Petr
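
P.S. To illustrate the overflow discussion above, here is a simplified,
untested sketch of how the writer side can append using only the atomic
@len. The function name nmi_seq_buf_store() is made up for the example
and this is not the exact code from the patch; it only shows why the
newest message is the one that gets dropped when the buffer is full:

	/*
	 * Sketch only: append a formatted message to the per-CPU NMI buffer.
	 * Space is reserved by moving the atomic @len forward with cmpxchg,
	 * so a message that does not fit is simply dropped and everything
	 * written earlier stays intact.
	 */
	static int nmi_seq_buf_store(struct nmi_seq_buf *s,
				     const char *fmt, va_list args)
	{
		int add, len;

	again:
		len = atomic_read(&s->len);

		/* The buffer is full; drop the newest message. */
		if (len >= NMI_LOG_BUF_LEN)
			return 0;

		/* Format into the unused tail of the buffer. */
		add = vscnprintf(s->buffer + len, NMI_LOG_BUF_LEN - len,
				 fmt, args);

		/*
		 * Make the data visible before the new @len is published.
		 * Pairs with the smp_rmb() in __printk_nmi_flush().
		 */
		smp_wmb();

		/* Someone changed @len in the meantime; try again. */
		if (atomic_cmpxchg(&s->len, len, len + add) != len)
			goto again;

		irq_work_queue(&s->work);
		return add;
	}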