On 08/30/2018 06:11 AM, Dmitry Vyukov wrote:
> On Wed, Aug 29, 2018 at 7:03 AM, 'Alexander Potapenko' via
> syzkaller-bugs <syzkaller-bugs@xxxxxxxxxxxxxxxx> wrote:
>> On Wed, Aug 29, 2018 at 3:46 PM Jan Kara <jack@xxxxxxx> wrote:
>>> On Tue 28-08-18 08:30:02, syzbot wrote:
>>>> Hello,
>>>>
>>>> syzbot found the following crash on:
>>>>
>>>> HEAD commit:    5b394b2ddf03 Linux 4.19-rc1
>>>> git tree:       upstream
>>>> console output: https://syzkaller.appspot.com/x/log.txt?x=14f4d8e1400000
>>>> kernel config:  https://syzkaller.appspot.com/x/.config?x=49927b422dcf0b29
>>>> dashboard link: https://syzkaller.appspot.com/bug?extid=45a34334c61a8ecf661d
>>>> compiler:       gcc (GCC) 8.0.1 20180413 (experimental)
>>>> syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=13127e5a400000
>>>>
>>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>>> Reported-by: syzbot+45a34334c61a8ecf661d@xxxxxxxxxxxxxxxxxxxxxxxxx
>>>>
>>>> IPv6: ADDRCONF(NETDEV_UP): veth1: link is not ready
>>>> IPv6: ADDRCONF(NETDEV_CHANGE): veth1: link becomes ready
>>>> IPv6: ADDRCONF(NETDEV_CHANGE): veth0: link becomes ready
>>>> 8021q: adding VLAN 0 to HW filter on device team0
>>>> ==================================================================
>>>> BUG: KASAN: stack-out-of-bounds in schedule_debug kernel/sched/core.c:3285
>>>> [inline]
>>>> BUG: KASAN: stack-out-of-bounds in __schedule+0x1977/0x1df0
>>>> kernel/sched/core.c:3395
>>>> Read of size 8 at addr ffff8801ad090000 by task syz-executor0/4718
>>>
>>> Weird, can you please help me decipher this? So here KASAN complains about
>>> wrong memory access in the scheduler.
>
> This looks like a result of a previous bad silent memory corruption.
>
> The KASAN report says there is a stack out-of-bounds in scheduler. And
> that is followed by slab corruption report in another task.
>
> fs/jbd2/transaction.c happens to be the first meaningful file in this
> crash, and so that's where it is attributed to.
>
> Rerunning the reproducer several times can maybe give some better
> clues, or maybe not, maybe they all will look equally puzzling.
>
> This part of the repro looks familiar:
>
> r1 = bpf$MAP_CREATE(0x0, &(0x7f0000002e40)={0x12, 0x0, 0x4, 0x6e, 0x0,
> 0x1}, 0x68)
> bpf$MAP_UPDATE_ELEM(0x2, &(0x7f0000000180)={r1, &(0x7f0000000000),
> &(0x7f0000000140)}, 0x20)
>
> We had exactly such consequences of a bug in bpf map very recently,
> but that was claimed to be fixed. Maybe not completely?
> +bpf maintainers

Looks like syzbot found this in Linus' tree at HEAD commit
5b394b2ddf03 ("Linux 4.19-rc1"); one day later the net PR got merged
via 050cdc6c9501 ("Merge git://git.kernel.org/pub/..."). This PR
contained a couple of fixes I did on the sockmap code during an audit,
such as:

  https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b845c898b2f1ea458d5453f0fa1da6e2dfce3bb4

Looking at the reproducer syzkaller found, it contains:

  r1 = bpf$MAP_CREATE(0x0, &(0x7f0000002e40)={0x12, 0x0, 0x4, 0x6e, 0x0, 0x1}, 0x68)
                                                    ^^^
So it found the crash with a map type of sock hash and a key size of 0x0
(which is invalid), where the subsequent map update triggered the
corruption.

I just did a 'syz test' and it wasn't able to trigger the crash anymore.

#syz fix: bpf, sockmap: fix sock_hash_alloc and reject zero-sized keys
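
For reference, the attribute block from the reproducer can be decoded
roughly as in the sketch below. This is only a userspace illustration
of the "reject zero-sized keys" idea, not the kernel patch itself. The
struct map_create_attr, SKETCH_MAP_TYPE_SOCKHASH and
check_sock_hash_attr() names are hypothetical, and the field order is
assumed to follow the leading BPF_MAP_CREATE members of union bpf_attr.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the leading BPF_MAP_CREATE fields of
 * union bpf_attr; this is not the real uapi definition. */
struct map_create_attr {
	uint32_t map_type;
	uint32_t key_size;
	uint32_t value_size;
	uint32_t max_entries;
	uint32_t map_flags;
	uint32_t inner_map_fd;
};

/* 0x12 == 18, the value of BPF_MAP_TYPE_SOCKHASH in the uapi enum. */
#define SKETCH_MAP_TYPE_SOCKHASH 0x12u

/* The six values passed by the syzkaller reproducer, in order. */
static const struct map_create_attr repro_attr = {
	.map_type     = 0x12, /* sock hash */
	.key_size     = 0x0,  /* the invalid part */
	.value_size   = 0x4,
	.max_entries  = 0x6e,
	.map_flags    = 0x0,
	.inner_map_fd = 0x1,
};

/* Illustrative check only: refuse a sock hash whose key size is zero.
 * Returns 0 if the attributes pass this one check, -EINVAL otherwise. */
static int check_sock_hash_attr(const struct map_create_attr *attr)
{
	assert(attr != NULL);

	if (attr->map_type != SKETCH_MAP_TYPE_SOCKHASH)
		return 0; /* other map types are out of scope here */
	if (attr->key_size == 0)
		return -EINVAL;
	return 0;
}
```

With this check in place, check_sock_hash_attr(&repro_attr) returns
-EINVAL, so the later map update from the reproducer would never get a
map to operate on.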