Re: [BUG] workqueues and printk not playing nice since next-20240130

"Paul E. McKenney" <paulmck@xxxxxxxxxx> · Fri, 2 Feb 2024 09:35:59 -0800

On Fri, Feb 02, 2024 at 06:08:25PM +0106, John Ogness wrote:
> On 2024-02-02, "Paul E. McKenney" <paulmck@xxxxxxxxxx> wrote:
> >> The printk ringbuffer contents would certainly be interesting.
> >> 
> >> If you build the GDB scripts (CONFIG_GDB_SCRIPTS) then you will have:
> >> 
> >> (gdb) lx-dmesg
> >
> > This says no such command even though I do have CONFIG_GDB_SCRIPTS=y
> > in my .config.
> 
> You actually need to build them as well. The target is "scripts_gdb".
> 
> And you probably need to add:
> 
> add-auto-load-safe-path /path/to/your/kernel/build/directory
> 
> to your .gdbinit
> 
> (This is documented in Documentation/dev-tools/gdb-kernel-debugging.rst)

Thank you!  Next time I am in a similar situation, I will pay more
attention to the documentation.

> >> As an alternative, you could copy the contents of
> >> Documentation/admin-guide/kdump/gdbmacros.txt into your .gdbinit and
> >> then you will have:
> >> 
> >> (gdb) dmesg
> >
> > This one hangs.
> 
> :-/ I will look into this.
> 
> > On the other hand, next-20240202 doesn't show the problem.  No idea
> > what might have changed.  :-/
> 
> Did you check the backtrace on all the "threads"? I would expect one of
> them has tty in it and is probably deadlocked. It is a known problem
> that if a WARN or lockdep triggers while the port lock is held, that
> CPU will deadlock itself. The effect is that no output is generated,
> but all the other CPUs will run fine. Even printk() calls will happily
> store into the ringbuffer, because they use trylock for printing and
> the deadlocked CPU will be holding the lock.
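
For anyone following along, the "stores fine but prints nothing" behavior
described above can be illustrated with a toy user-space model. This is
only a sketch and not kernel code: the ToyPrintk class, its field names,
and the single console_lock standing in for whatever lock the deadlocked
CPU is holding are all assumptions made for illustration.

```python
import threading


class ToyPrintk:
    """Toy model: storing a record always succeeds, printing is best-effort."""

    def __init__(self):
        self.records = []                     # stands in for the printk ringbuffer
        self.console_seq = 0                  # index of the next record to print
        self.console = []                     # what actually reached the "tty"
        self.console_lock = threading.Lock()  # hypothetical single printing lock

    def printk(self, msg):
        # Storing the record does not need the printing lock.
        self.records.append(msg)
        # Printing uses trylock: give up rather than block if the lock is held.
        if self.console_lock.acquire(blocking=False):
            try:
                while self.console_seq < len(self.records):
                    self.console.append(self.records[self.console_seq])
                    self.console_seq += 1
            finally:
                self.console_lock.release()


log = ToyPrintk()
log.printk("before")        # lock is free, so this record is printed

# Model a CPU that deadlocks while holding the lock and never releases it.
log.console_lock.acquire()

log.printk("after 1")       # stored, but trylock fails, so nothing is printed
log.printk("after 2")

print("console:   ", log.console)
print("ringbuffer:", log.records)
```

In this model the later records are still sitting in the buffer even
though the "console" went silent, which is why dumping the ringbuffer
from the debugger is worthwhile in this situation.
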

Again, thank you, and another thing for me to try should this start
happening again.

							Thanx, Paul