Re: linux 5.14.3: free_user_ns causes NULL pointer dereference

Hi!

Just wanted to let you know that I still get these on the stock Fedora kernel 5.14.10 on the IBM blades, but it took 10 hours before the first server crashed. The other 4 have been running fine for 15 hours. So it seems more stable for me now, but that could just be a coincidence.

Best regards,

Rune

------------[ cut here ]------------
kernel BUG at mm/slub.c:321!
invalid opcode: 0000 [#1] SMP PTI
CPU: 22 PID: 1838853 Comm: python3 Not tainted 5.14.10-200.fc34.x86_64 #1
Hardware name: IBM BladeCenter HS22 -[7870TKN]-/68Y8161, BIOS -[P9E164CUS-1.28]- 04/17/2018
RIP: 0010:__slab_free+0x245/0x4a0
Code: 0f b6 5c 24 1b 44 8b 44 24 1c 48 89 44 24 08 48 8b 54 24 20 4c 8b 4c 24 28 e9 8a fe ff ff 41 f7 45 08 00 0d 21 00 75 98 eb 8d <0f> 0b 49 3b 54 24 28 0f 85 53 ff ff ff 49 8b 44 24 08 4>
RSP: 0018:ffffb71dcfd6fda0 EFLAGS: 00010246
RAX: ffff9c5480d35860 RBX: ffff9c5480d35800 RCX: ffff9c5480d35800
RDX: 00000000802a0029 RSI: ffffeb41da034d00 RDI: ffff9c4f00042800
RBP: ffffb71dcfd6fe50 R08: 0000000000000001 R09: ffffffff9210b6a5
R10: ffff9c548102e000 R11: 0000000062667658 R12: ffffeb41da034d00
R13: ffff9c4f00042800 R14: ffff9c5480d35800 R15: ffff9c5480d35800
FS:  00007f7a89765740(0000) GS:ffff9c6c1fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7a7ad7f4b0 CR3: 0000000564af4002 CR4: 00000000000206e0
Call Trace:
 ? filename_lookup+0x135/0x1b0
 ? put_ucounts+0x65/0x70
 kfree+0x369/0x3c0
 put_ucounts+0x65/0x70
 put_cred_rcu+0x70/0xd0
 do_faccessat+0x113/0x240
 do_syscall_64+0x3b/0x90
 entry_SYSCALL_64_after_hwframe+0x44/0xae
RIP: 0033:0x7f7a899cc44b
Code: 77 05 c3 0f 1f 40 00 48 8b 15 29 1a 0d 00 f7 d8 64 89 02 48 c7 c0 ff ff ff ff c3 0f 1f 40 00 f3 0f 1e fa b8 15 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 05 c3 0f 1f 40 00 48 8b 15 f9 19 0>
RSP: 002b:00007ffd01fa9ce8 EFLAGS: 00000202 ORIG_RAX: 0000000000000015
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f7a899cc44b
RDX: 0000000000000000 RSI: 0000000000000001 RDI: 00007f79a96d6e10
RBP: 0000000000000001 R08: 0000000000000000 R09: 00007f7a7c1fb930
R10: 00007f79a96d6000 R11: 0000000000000202 R12: 00007ffd01fa9d00
R13: 0000000000000001 R14: 0000556841045c90 R15: 00000000ffffff9c
Modules linked in: rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache netfs nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_rejec>
---[ end trace 0a81b150eacde1d5 ]---
RIP: 0010:__slab_free+0x245/0x4a0
Code: 0f b6 5c 24 1b 44 8b 44 24 1c 48 89 44 24 08 48 8b 54 24 20 4c 8b 4c 24 28 e9 8a fe ff ff 41 f7 45 08 00 0d 21 00 75 98 eb 8d <0f> 0b 49 3b 54 24 28 0f 85 53 ff ff ff 49 8b 44 24 08 4>
RSP: 0018:ffffb71dcfd6fda0 EFLAGS: 00010246
RAX: ffff9c5480d35860 RBX: ffff9c5480d35800 RCX: ffff9c5480d35800
RDX: 00000000802a0029 RSI: ffffeb41da034d00 RDI: ffff9c4f00042800
RBP: ffffb71dcfd6fe50 R08: 0000000000000001 R09: ffffffff9210b6a5
R10: ffff9c548102e000 R11: 0000000062667658 R12: ffffeb41da034d00
R13: ffff9c4f00042800 R14: ffff9c5480d35800 R15: ffff9c5480d35800
FS:  00007f7a89765740(0000) GS:ffff9c6c1fc80000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f7a7ad7f4b0 CR3: 0000000564af4002 CR4: 00000000000206e0
------------[ cut here ]------------

On 04/10/2021 19:19, Eric W. Biederman wrote:
ebiederm@xxxxxxxxxxxx (Eric W. Biederman) writes:

Adding Rune Kleveland to the discussion as he also seems to have
reproduced the issue.

Alex and I have been staring at the code and the reports and this
bug is hiding well.  Here is what we have figured out so far.

Both the warning from free_user_ns calling dec_ucount that Jordan Glover
reported and the KASAN error that Yu Zhao has reported appear to have
the same cause: a ucounts structure being used after it has been freed
and reallocated as something else.
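
To make the suspected failure mode concrete, here is a minimal user-space
sketch of how one unbalanced put on a reference-counted object leaves
another holder with a pointer to freed memory.  This is not the kernel
code: struct obj, get_obj and put_obj are hypothetical stand-ins for
struct ucounts, get_ucounts and put_ucounts, and a flag is set instead of
calling free() so the demonstration stays well defined.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a refcounted object such as struct ucounts. */
struct obj {
	int count;	/* reference count */
	bool freed;	/* set instead of calling free() */
};

/* Take one reference. */
static void get_obj(struct obj *o)
{
	o->count++;
}

/* Drop one reference; mark the object freed when the count hits zero. */
static void put_obj(struct obj *o)
{
	if (--o->count == 0)
		o->freed = true;
}

/* Returns true if holder B is left pointing at a freed object. */
static bool unbalanced_put_frees_early(void)
{
	struct obj o = { .count = 1, .freed = false };	/* holder A */

	get_obj(&o);	/* holder B takes a reference: count == 2 */
	put_obj(&o);	/* holder A drops its reference: count == 1 */
	put_obj(&o);	/* buggy extra put: count == 0, object freed */

	return o.freed;	/* holder B still has a pointer to it */
}
```

In the real allocator the freed memory can be handed out again, so holder
B's later put would operate on whatever object now lives there.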

I have just skimmed through the recent report from Rune Kleveland
and it appears also to be a use after free.  Especially since the
second failure in the log is slub complaining about trying to free
the ucounts data structure.

We looked through the users of put_ucounts and we don't see any obvious
buggy users that would be freeing the data structure early.

Alex has tried to reproduce this but so far is not having any luck.
Folks can you tell what compiler versions you are using and share your
kernel config with us?  That might help.

The little debug diff below is my guess of what is happening.  If the
folks who can reproduce this issue can try the patch below and let me
know if the warnings fire that would be appreciated.  It is still not
enough to track down the bug but at least it will confirm my current
hypothesis about how things look before there is a use of memory after
it is freed.

Bah.  Scratch that test patch.  I just double checked myself and
cred->ucounts and cred->user_ns->ucounts should never be equal,
as the user namespace is counted in its parent user namespace.

That observation now tells me I have a parent user namespace that went
corrupt.

Back to the drawing board.


Thank you,
Eric

diff --git a/kernel/cred.c b/kernel/cred.c
index f784e08c2fbd..e7ffaa3cf5a6 100644
--- a/kernel/cred.c
+++ b/kernel/cred.c
@@ -120,6 +120,12 @@ static void put_cred_rcu(struct rcu_head *rcu)
  	if (cred->group_info)
  		put_group_info(cred->group_info);
  	free_uid(cred->user);
+#if 1
+	if ((cred->ucounts == cred->user_ns->ucounts) &&
+	    (atomic_read(&cred->ucounts->count) == 1)) {
+		WARN_ONCE(1, "put_cred_rcu: ucount count 1\n");
+	}
+#endif
  	if (cred->ucounts)
  		put_ucounts(cred->ucounts);
  	put_user_ns(cred->user_ns);
diff --git a/kernel/exit.c b/kernel/exit.c
index 91a43e57a32e..60fd88b34c1a 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -743,6 +743,13 @@ void __noreturn do_exit(long code)
  	if (unlikely(!tsk->pid))
  		panic("Attempted to kill the idle task!");
+#if 1
+	if ((tsk->cred->ucounts == tsk->cred->user_ns->ucounts) &&
+	    (atomic_read(&tsk->cred->ucounts->count) == 1)) {
+		WARN_ONCE(1, "do_exit: ucount count 1\n");
+	}
+#endif
+
  	/*
  	 * If do_exit is called because this processes oopsed, it's possible
  	 * that get_fs() was left as KERNEL_DS, so reset it to USER_DS before



