Re: [PATCH v2] rbtree: fix the red root

Douglas Gilbert <dgilbert@xxxxxxxxxxxx> · Sun, 13 Jan 2019 23:52:06 -0500

On 2019-01-13 10:59 p.m., Esme wrote:
‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Sunday, January 13, 2019 10:52 PM, Douglas Gilbert <dgilbert@xxxxxxxxxxxx> wrote:

On 2019-01-13 10:07 p.m., Esme wrote:

‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐
On Sunday, January 13, 2019 9:33 PM, Qian Cai cai@xxxxxx wrote:

On 1/13/19 9:20 PM, David Lechner wrote:

On 1/11/19 8:58 PM, Michel Lespinasse wrote:

On Fri, Jan 11, 2019 at 3:47 PM David Lechner david@xxxxxxxxxxxxxx wrote:

On 1/11/19 2:58 PM, Qian Cai wrote:

A GPF was reported,
kasan: CONFIG_KASAN_INLINE enabled
kasan: GPF could be caused by NULL-ptr deref or user memory access
general protection fault: 0000 [#1] SMP KASAN
           kasan_die_handler.cold.22+0x11/0x31
           notifier_call_chain+0x17b/0x390
           atomic_notifier_call_chain+0xa7/0x1b0
           notify_die+0x1be/0x2e0
           do_general_protection+0x13e/0x330
           general_protection+0x1e/0x30
           rb_insert_color+0x189/0x1480
           create_object+0x785/0xca0
           kmemleak_alloc+0x2f/0x50
           kmem_cache_alloc+0x1b9/0x3c0
           getname_flags+0xdb/0x5d0
           getname+0x1e/0x20
           do_sys_open+0x3a1/0x7d0
           __x64_sys_open+0x7e/0xc0
           do_syscall_64+0x1b3/0x820
           entry_SYSCALL_64_after_hwframe+0x49/0xbe
It turned out,
gparent = rb_red_parent(parent);
tmp = gparent->rb_right; <-- GPF was triggered here.
Apparently, "gparent" is NULL, which indicates that "parent" is the rbtree's
root and is red. Otherwise, it would have been handled properly a few lines above.
/*
 * If there is a black parent, we are done.
 * Otherwise, take some corrective action as,
 * per 4), we don't want a red root or two
 * consecutive red nodes.
 */
if(rb_is_black(parent))
        break;
Hence, it violates rule #1 (the root can't be red) and needs a fix-up.
Also add a regression test for it. This looks like it was
introduced by 6d58452dc06, which no longer always paints the root
black.
Fixes: 6d58452dc06 (rbtree: adjust root color in rb_insert_color() only when necessary)
Reported-by: Esme <esploit@xxxxxxxxxxxxx>
Tested-by: Joey Pabalinas <joeypabalinas@xxxxxxxxx>
Signed-off-by: Qian Cai <cai@xxxxxx>
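
For readers following the reasoning in the commit message above, here is a
minimal user-space sketch of why a red root leads to a NULL "gparent". It
assumes the kernel's usual encoding, where the parent pointer and the color
share one word and red is 0, so rb_red_parent() is just a cast of that word.
The struct below is a simplified stand-in for the kernel's struct rb_node,
and gparent_would_be_null() is a hypothetical helper written for this
illustration only. It is not part of lib/rbtree.c or of the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define RB_RED   0
#define RB_BLACK 1

/* Simplified stand-in for the kernel's struct rb_node (assumption). */
struct rb_node {
	uintptr_t __rb_parent_color;	/* parent pointer, color in bit 0 */
	struct rb_node *rb_right;
	struct rb_node *rb_left;
};

static inline int rb_is_black(const struct rb_node *n)
{
	return (int)(n->__rb_parent_color & 1);
}

/* Only valid for a red node: the color bit is 0, so the word is the parent. */
static inline struct rb_node *rb_red_parent(const struct rb_node *red)
{
	return (struct rb_node *)red->__rb_parent_color;
}

static inline void rb_set_parent_color(struct rb_node *n,
				       struct rb_node *parent, int color)
{
	n->__rb_parent_color = (uintptr_t)parent | (uintptr_t)color;
}

/*
 * Hypothetical helper: returns 1 if the insertion fix-up loop, given a new
 * node whose parent is "parent", would compute a NULL grandparent and then
 * fault on gparent->rb_right.
 */
static int gparent_would_be_null(const struct rb_node *parent)
{
	assert(parent != NULL);
	if (rb_is_black(parent))
		return 0;	/* the loop breaks here, no gparent lookup */
	return rb_red_parent(parent) == NULL;
}
```

With a root whose parent is NULL and whose color is red, the packed word is
0, so gparent_would_be_null() returns 1. With a black root, the
rb_is_black() check ends the loop before any grandparent is looked up.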

Tested-by: David Lechner <david@xxxxxxxxxxxxxx>
FWIW, this fixed the following crash for me:
Unable to handle kernel NULL pointer dereference at virtual address 00000004

Just to clarify, do you have a way to reproduce this crash without the fix?

I am starting to suspect that my crash was caused by some new code
in the drm-misc-next tree that might be causing a memory corruption.
It threw me off that the stack trace didn't contain anything related
to drm.
See: https://patchwork.freedesktop.org/patch/276719/

It may be useful for those who could reproduce this issue to turn on those
memory corruption debug options to narrow down a bit.
CONFIG_DEBUG_PAGEALLOC=y
CONFIG_DEBUG_PAGEALLOC_ENABLE_DEFAULT=y
CONFIG_KASAN=y
CONFIG_KASAN_GENERIC=y
CONFIG_SLUB_DEBUG_ON=y

I have been on SLAB, so I configured SLAB DEBUG on a fresh pull from GitHub. Linux syzkaller 5.0.0-rc2 #9 SMP Sun Jan 13 21:57:40 EST 2019 x86_64
...
In an effort to get a different stack into the kernel, I felt that nothing works better than a fork bomb. :)
Let me know if that helps.
root@syzkaller:~# gcc -o test3 test3.c
root@syzkaller:~# while : ; do ./test3 & done

And is test3 the same multi-threaded program that enters the kernel via
/dev/sg0 and then calls SCSI_IOCTL_SEND_COMMAND which goes to the SCSI
mid-level and thence to the block layer?

And please remind me, does it also fail on lk 4.20.2?

Doug Gilbert

Yes, the same C repro from the earlier thread.  It was a 4.20.0 kernel where it was first detected.  I can move to 4.20.2 and see if that changes anything.

Hi,
I don't think there is any need to check lk 4.20.2 (as it would
be very surprising if it didn't also have this "feature").

More interesting might be: has "test3" been run on lk 4.19 or
any earlier kernel?

Doug Gilbert