On Fri 2024-10-18 09:20:19, John Ogness wrote:
> On 2024-10-17, Petr Mladek <pmladek@xxxxxxxx> wrote:
> > # echo h >/proc/sysrq-trigger
> >
> > produced:
> >
> > [ 53.669907] BUG: assuming non migratable context at kernel/printk/printk_safe.c:23
> > [ 53.669920] in_atomic(): 0, irqs_disabled(): 0, migration_disabled() 0 pid: 1637, name: bash
> > [ 53.669931] 2 locks held by bash/1637:
> > [ 53.669936] #0: ffff8ae680a384a8 (sb_writers#4){.+.+}-{0:0}, at: ksys_write+0x6e/0xf0
> > [ 53.669968] #1: ffffffff83f226e0 (rcu_read_lock){....}-{1:3}, at: __handle_sysrq+0x3d/0x120
> > [ 53.670002] CPU: 2 UID: 0 PID: 1637 Comm: bash Not tainted 6.12.0-rc3-default+ #67
> > [ 53.670011] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.16.3-2-gc13ff2cd-prebuilt.qemu.org 04/01/2014
> > [ 53.670020] Call Trace:
> > [ 53.670026] <TASK>
> > [ 53.670045] dump_stack_lvl+0x6c/0xa0
> > [ 53.670064] __cant_migrate.cold+0x7c/0x89
> > [ 53.670080] printk_loud_console_enter+0x15/0x30
> > [ 53.670088] __handle_sysrq+0x60/0x120
> > [ 53.670104] write_sysrq_trigger+0x6a/0xa0
> > [ 53.670120] proc_reg_write+0x5f/0xb0
> > [ 53.670132] vfs_write+0xf9/0x540
> > [ 53.670147] ? __lock_release.isra.0+0x1a6/0x2c0
> > [ 53.670172] ? do_user_addr_fault+0x38c/0x720
> > [ 53.670197] ksys_write+0x6e/0xf0
> > [ 53.670220] do_syscall_64+0x79/0x190
> > [ 53.670238] entry_SYSCALL_64_after_hwframe+0x76/0x7e
> >
> > IMHO, the best solution would be to call migrate_disable()/enable()
> > in printk_loud_console_enter()/exit().
>
> That will not work because migrate_enable() can only be called from
> can_sleep context. Instead, the migrate_disable()/enable() should be at
> the few (one?) call sites where printk_loud_console_enter()/exit() is
> used from task context.

Hmm, if I get it correctly, we cannot use migrate_disable() in
__handle_sysrq() because it can also be called in atomic context,
for example:

  + pl010_int()
    + pl010_rx_chars()
      + uart_handle_sysrq_char()
        + handle_sysrq()
          + __handle_sysrq()

I do not see any easy way to distinguish whether it was called in
an atomic context or not.

So, I see three possibilities:

1. Explicitly call preempt_disable() in __handle_sysrq(). It would be
   just around the single line or the help. But still, I do not like
   it much.

2. Avoid the per-CPU variable. Force adding the LOUD_CON/FORCE_CON
   flag using a global variable, e.g. printk_force_console.

   The problem is that it might also affect messages printed by
   other CPUs. And there might be many.

   Well, console_loglevel is a global variable. The original code
   had a similar problem.

3. Add the LOUD_CON/FLUSH_CON flag via a parameter. For example,
   by a special LOGLEVEL_FORCE_CON, similar to LOGLEVEL_SCHED.

   It might work well for __handle_sysrq() which calls the affected
   printk() directly. But it won't work, for example, for
   kdb_show_stack(). It wants to show messages printed by nested
   functions.

I personally prefer the 2nd variant. It fixes the problem and it should
not make things worse.

Best Regards,
Petr
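
For illustration only, the 2nd variant could look roughly like the
userspace sketch below. It is not kernel code and not the actual patch:
only the variable name printk_force_console comes from the text above.
The sketch_* helpers, the SKETCH_LOG_FORCE_CON bit and the use of C11
atomics in place of the kernel's atomic_t are assumptions made so the
example compiles on its own.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical flag bit; the real flag name and value are not settled here. */
#define SKETCH_LOG_FORCE_CON 0x1u

/*
 * Global (not per-CPU) nesting counter. Because it is global, the
 * enter/exit helpers do not care whether the caller can migrate.
 */
static atomic_int printk_force_console;

/* Hypothetical helper: start forcing messages to the console. */
static void sketch_force_console_enter(void)
{
	atomic_fetch_add(&printk_force_console, 1);
}

/* Hypothetical helper: stop forcing; calls must pair with enter. */
static void sketch_force_console_exit(void)
{
	atomic_fetch_sub(&printk_force_console, 1);
}

/* True while at least one enter is not yet matched by an exit. */
static bool sketch_is_force_console(void)
{
	return atomic_load(&printk_force_console) > 0;
}

/*
 * Hypothetical hook on the message-store path: tag the record while
 * forcing is active. Every CPU reads the same counter, so messages
 * printed by other CPUs in the window get the flag as well.
 */
static unsigned int sketch_record_flags(unsigned int flags)
{
	if (sketch_is_force_console())
		flags |= SKETCH_LOG_FORCE_CON;
	return flags;
}
```

A caller would wrap the affected printk() calls between the enter and
exit helpers. The counter (rather than a plain bool) keeps nested or
concurrent sections from clearing each other's state, at the cost
already mentioned: unrelated messages from other CPUs are forced too.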