On Tue, 17 Dec 2024 14:20:33 +0000 Alessandro Carminati <acarmina@xxxxxxxxxx> wrote: > Address a bug in the kernel that triggers a "sleeping function called from > invalid context" warning when /sys/kernel/debug/kmemleak is printed under > specific conditions: > - CONFIG_PREEMPT_RT=y > - Set SELinux as the LSM for the system > - Set kptr_restrict to 1 > - kmemleak buffer contains at least one item > > BUG: sleeping function called from invalid context at kernel/locking/spinlock_rt.c:48 > in_atomic(): 1, irqs_disabled(): 1, non_block: 0, pid: 136, name: cat -rt is a bit annoying this way. Things which we expect to work OK are no longer doing so. > preempt_count: 1, expected: 0 > RCU nest depth: 2, expected: 2 > 6 locks held by cat/136: > #0: ffff32e64bcbf950 (&p->lock){+.+.}-{3:3}, at: seq_read_iter+0xb8/0xe30 > #1: ffffafe6aaa9dea0 (scan_mutex){+.+.}-{3:3}, at: kmemleak_seq_start+0x34/0x128 > #3: ffff32e6546b1cd0 (&object->lock){....}-{2:2}, at: kmemleak_seq_show+0x3c/0x1e0 > #4: ffffafe6aa8d8560 (rcu_read_lock){....}-{1:2}, at: has_ns_capability_noaudit+0x8/0x1b0 > #5: ffffafe6aabbc0f8 (notif_lock){+.+.}-{2:2}, at: avc_compute_av+0xc4/0x3d0 > irq event stamp: 136660 > hardirqs last enabled at (136659): [<ffffafe6a80fd7a0>] _raw_spin_unlock_irqrestore+0xa8/0xd8 > hardirqs last disabled at (136660): [<ffffafe6a80fd85c>] _raw_spin_lock_irqsave+0x8c/0xb0 > softirqs last enabled at (0): [<ffffafe6a5d50b28>] copy_process+0x11d8/0x3df8 > softirqs last disabled at (0): [<0000000000000000>] 0x0 > Preemption disabled at: > [<ffffafe6a6598a4c>] kmemleak_seq_show+0x3c/0x1e0 > CPU: 1 UID: 0 PID: 136 Comm: cat Tainted: G E 6.11.0-rt7+ #34 > Tainted: [E]=UNSIGNED_MODULE > Hardware name: linux,dummy-virt (DT) > Call trace: > dump_backtrace+0xa0/0x128 > show_stack+0x1c/0x30 > dump_stack_lvl+0xe8/0x198 > dump_stack+0x18/0x20 > rt_spin_lock+0x8c/0x1a8 > avc_perm_nonode+0xa0/0x150 > cred_has_capability.isra.0+0x118/0x218 > selinux_capable+0x50/0x80 > security_capable+0x7c/0xd0 > has_ns_capability_noaudit+0x94/0x1b0 > has_capability_noaudit+0x20/0x30 > restricted_pointer+0x21c/0x4b0 > pointer+0x298/0x760 > vsnprintf+0x330/0xf70 > seq_printf+0x178/0x218 > print_unreferenced+0x1a4/0x2d0 > kmemleak_seq_show+0xd0/0x1e0 > seq_read_iter+0x354/0xe30 > seq_read+0x250/0x378 > full_proxy_read+0xd8/0x148 > vfs_read+0x190/0x918 > ksys_read+0xf0/0x1e0 > __arm64_sys_read+0x70/0xa8 > invoke_syscall.constprop.0+0xd4/0x1d8 > el0_svc+0x50/0x158 > el0t_64_sync+0x17c/0x180 > > %pS and %pK, in the same back trace line, are redundant, and %pS can void > %pK service in certain contexts. > > %pS alone already provides the necessary information, and if it cannot > resolve the symbol, it falls back to printing the raw address voiding > the original intent behind the %pK. > > Additionally, %pK requires a privilege check CAP_SYSLOG enforced through > the LSM, which can trigger a "sleeping function called from invalid > context" warning under RT_PREEMPT kernels when the check occurs in an > atomic context. This issue may also affect other LSMs. > > This change avoids the unnecessary privilege check and resolves the > sleeping function warning without any loss of information. > > Signed-off-by: Alessandro Carminati <acarmina@xxxxxxxxxx> I'm adding Fixes: 3a6f33d86baa ("mm/kmemleak: use %pK to display kernel pointers in backtrace") Cc: <stable@xxxxxxxxxxxxxxx> > --- a/mm/kmemleak.c > +++ b/mm/kmemleak.c > @@ -373,7 +373,7 @@ static void print_unreferenced(struct seq_file *seq, > > for (i = 0; i < nr_entries; i++) { > void *ptr = (void *)entries[i]; > - warn_or_seq_printf(seq, " [<%pK>] %pS\n", ptr, ptr); > + warn_or_seq_printf(seq, " %pS\n", ptr); > } > } Before 3a6f33d86baa we were still printing the address, with plain old %p. Should we restore that, or is %p always useless?