Re: [Bugme-new] [Bug 11391] New: Kernel NULL pointer dereference in do_notify_parent()

Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> · Thu, 21 Aug 2008 09:23:22 -0700

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 21 Aug 2008 05:58:52 -0700 (PDT) bugme-daemon@xxxxxxxxxxxxxxxxxxx wrote:

> http://bugzilla.kernel.org/show_bug.cgi?id=11391
> 
>            Summary: Kernel NULL pointer dereference in do_notify_parent()
>            Product: Process Management
>            Version: 2.5
>      KernelVersion: 2.6.26.3

Should have been 2.6.26.4?

>           Platform: All
>         OS/Version: Linux
>               Tree: Mainline
>             Status: NEW
>           Severity: normal
>           Priority: P1
>          Component: Other
>         AssignedTo: process_other@xxxxxxxxxxxxxxxxxxxx
>         ReportedBy: robert.rex@xxxxxxxxxx
> 
> 
> Latest working kernel version: 2.6.26.3
> 
> Earliest failing kernel version: 2.6.25.4 (didn't test with former kernels)

Appears to be a regression in -stable.   Did any namespacy things go into
2.6.26.4?

> Distribution: CentOS 5.1 (with Vanilla kernel from kernel.org)
> 
> Hardware Environment: several x86_64 platforms (AMD Opteron, Intel Xeon)
> 
> Problem Description:
> -------------------------------------
> BUG: unable to handle kernel NULL pointer dereference at virtual address 0000000000000020
> IP: [<ffffffff8023d5d0>] do_notify_parent+0x66/0x194
> PGD 0
> Oops: 0000 [1] SMP
> CPU 1
> Modules linked in: ipv6 autofs4 hidp rfcomm l2cap bluetooth sunrpc dm_mirror
> dm_log dm_multipath dm_mod video output sbs sbshc battery acpi_memhotplug ac lp
> sg floppy button tg3 serio_raw parport_pc parport k8temp hwmon i2c_amd756
> i2c_amd8111 i2c_core amd_rng shpchp pcspkr usb_storage 3w_9xxx sata_sil libata
> sd_mod scsi_mod raid456 async_xor async_memcpy async_tx xor ext3 jbd ehci_hcd
> ohci_hcd uhci_hcd
> Pid: 3800, comm: sshd Not tainted 2.6.26.3 #1
> RIP: 0010 [<ffffffff8023d5d0>]  [<ffffffff8023d5d0>] do_notify_parent+0x66/0x194
> RSP: 0018:ffff8101fd943c78  EFLAGS: 00010046
> RAX: 0000000000000000 RBX: ffff8101fe08f2f0 RCX: ffff8101fd956870
> RDX: ffff8101fe08f4c0 RSI: 0000000000000011 RDI: ffff8101fe08f2f0
> RBP: 0000000000000000 R08: 0000000000000009 R09: 0000000000000009
> R10: 0000000000000002 R11: ffffffff802f1c0e R12: 0000000000000011
> R13: ffff8101fe4e00c0 R14: 0000000000000000 R15: 0000000000000001
> FS:  00007fce4b4b2710(0000) GS:ffff8101ff08c8c0(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> CR2: 0000000000000020 CR3: 0000000000201000 CR4: 00000000000006e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> Process sshd (pid: 3800, threadinfo ffff8101fd942000, task ffff8101fe4e00d0)
> Stack:  0000000000000011 ffff8101fec76630 ffff8101fe0e1180 ffffffff8029d597
>  0000000000000008 ffff8101fe0e1180 ffff8101fe7e87c0 ffffffff802a1915
>  ffff8101fd856c40 ffff8101fe0e1180 ffff8101fd856c40 0000000000000000
> Call Trace:
> [<ffffffff8029d597>] dput+0x26/0xe7
> [<ffffffff802a1915>] mntput_no_expire+0x20/0x119
> [<ffffffff8028b557>] filp_close+0x5d/0x65
> [<ffffffff80233cd1>] reparent_thread+0x139/0x14d
> [<ffffffff802350ba>] do_exit+0x39a/0x68c
> [<ffffffff80235412>] do_group_exit+0x66/0x96
> [<ffffffff8023d4f7>] get_signal_to_deliver+0x2ea/0x305
> [<ffffffff8020b166>] do_notify_resume+0xaf/0x7de
> [<ffffffff802435de>] autoremove_wake_function+0x0/0x2e
> [<ffffffff80236198>] current_fs_time+0x1e/0x24
> [<ffffffff8036dfdb>] tty_ldisc_deref+0x62/0x75
> [<ffffffff8025bdfe>] audit_syscall_exit+0x2e4/0x303
> [<ffffffff8020bf8c>] int_signal+0x12/0x17
> 
> Code: 00 48 39 87 30 02 00 00 74 04 0f 0b eb fe 44 89 24 24 c7 44 24 04 00 00
> 00 00 48 8b 83 b8 01 00 00 48 89 df 48 8b 80 98 04 00 00 <48> 8b 70 20 e8 57
> 39 00 00 48 8b 93 a0 04 00 00 89 44 24 10 8b
> RIP  [<ffffffff8023d5c9>] do_notify_parent+0x66/0x194
>  RSP <ffff8101f7535c78>
> CR2: 0000000000000020
> ---[ end trace 8df15d3ad47033c0 ]---
> Fixing recursive fault but reboot is needed!
> -------------------------------------
> 
> The problem occurs with PID namespaces enabled. After killing the child reaper
> of a new namespace with SIGKILL, the kernel crashes. I did some debugging and,
> as far as I could see, the NULL pointer dereference happens on this line:
> 
> info.si_pid = task_pid_nr_ns(tsk, tsk->parent->nsproxy->pid_ns);
> 
> I did a BUG_ON(!tsk->parent->nsproxy) one line above and got an appropriate
> message before the kernel crashed.
> 
> Software Environment:
> (test program attached)
> 
> Steps to reproduce:
> 
> Compile the attached test program with "gcc -o ns_exec ns_exec.c -lpthread".
> When started, it creates a new PID namespace, mounts a proc filesystem
> in it, creates a new thread and fork()s into an sshd.
> Log in via SSH (the port of the started sshd is hardcoded in the test program,
> so you'll have to modify it appropriately if you wish to do so ;-) ). Do a
> "kill -9 1". On my machines, the kernel crashed in over 90% of all tests.
> 
> 
