On Tue, Jul 10, 2018 at 12:07 AM, Kirill A. Shutemov
<kirill@xxxxxxxxxxxxx> wrote:
> On Mon, Jul 09, 2018 at 07:23:15PM +0200, Dmitry Vyukov wrote:
>> On Mon, Jul 9, 2018 at 5:25 PM, Kirill A. Shutemov <kirill@xxxxxxxxxxxxx> wrote:
>> > On Mon, Jul 09, 2018 at 05:21:55PM +0300, Kirill A. Shutemov wrote:
>> >> > This also happened only once so far:
>> >> > https://syzkaller.appspot.com/bug?extid=3f84280d52be9b7083cc
>> >> > and I can't reproduce it rerunning this program. So it's either a very
>> >> > subtle race, or fd in the middle of netlink address magically matched
>> >> > some fd once, or something else...
>> >>
>> >> Okay, I've got it reproduced. See below.
>> >>
>> >> The problem is that kcov doesn't set vm_ops for the VMA and it makes
>> >> the kernel think that the VMA is anonymous.
>> >>
>> >> It's not necessarily the way it was triggered by syzkaller. I just found
>> >> that kcov's ->mmap doesn't set vm_ops. There can be more such cases.
>> >> vma_is_anonymous() is what we need to fix.
>> >>
>> >> ( Although, I found the logic around mmapping the file a second time
>> >>   questionable at best. It seems broken to me. )
>> >>
>> >> It is known that vma_is_anonymous() can produce false-positives. I tried
>> >> to fix it once[1], but it backfired[2].
>> >>
>> >> I'll look at this again.
>> >
>> > Below is a patch that seems to work. But it definitely requires more testing.
>> >
>> > Dmitry, could you give it a try in syzkaller?
>>
>> Trying.
>>
>> Not sure what you expect from this. Either way it will be hundreds of
>> crashes before vs hundreds of crashes after ;)
>>
>> But one that started popping up is this, looks like it's somewhere
>> around the code your patch touches:
>>
>> kasan: CONFIG_KASAN_INLINE enabled
>> kasan: GPF could be caused by NULL-ptr deref or user memory access
>> general protection fault: 0000 [#1] SMP KASAN
>> CPU: 0 PID: 6711 Comm: syz-executor3 Not tainted 4.18.0-rc4+ #43
>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-1 04/01/2014
>> RIP: 0010:__get_vma_policy+0x61/0x160 mm/mempolicy.c:1620
>
> Right, my bad. Here's a fixup.
>
> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
> index d508c7844681..12b2b3c7f51e 100644
> --- a/fs/hugetlbfs/inode.c
> +++ b/fs/hugetlbfs/inode.c
> @@ -597,6 +597,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset,
>  		memset(&pseudo_vma, 0, sizeof(struct vm_area_struct));
>  		pseudo_vma.vm_flags = (VM_HUGETLB | VM_MAYSHARE | VM_SHARED);
>  		pseudo_vma.vm_file = file;
> +		pseudo_vma.vm_ops = &anon_vm_ops;
>
>  		for (index = start; index < end; index++) {
>  			/*

With this change I don't see anything that stands out, just a typical
mix of crashes like these:

BUG: unable to handle kernel paging request in kfree
INFO: task hung in flush_work
KASAN: slab-out-of-bounds Read in fscache_alloc_cookie
KASAN: use-after-free Read in __queue_work
general protection fault in encode_rpcb_string
lost connection to test machine
no output from test machine
unregister_netdevice: waiting for DEV to become free

So I guess this can be qualified as +1 for the patch.
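
For anyone following along, here is a minimal user-space sketch of the
false positive described in the quoted text. It assumes the check is
simply "no vm_ops means anonymous", which is my reading of the thread
rather than a copy of the kernel source. The struct layouts,
driver_vm_ops, and the two mmap_* helpers are hypothetical stand-ins
made up for illustration; none of this is kernel code.

```c
/* User-space model only; these are NOT the kernel's definitions. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct vm_operations_struct {
	const char *name;	/* hypothetical field, for illustration only */
};

struct vm_area_struct {
	unsigned long vm_flags;
	const struct vm_operations_struct *vm_ops;
};

/* Assumed shape of the check: a VMA with no vm_ops is treated as anonymous. */
static bool vma_is_anonymous(const struct vm_area_struct *vma)
{
	return !vma->vm_ops;
}

/* Hypothetical stand-in for a driver's ops table. */
static const struct vm_operations_struct driver_vm_ops = { .name = "driver" };

/* Models a driver ->mmap handler that never sets vm_ops. */
void mmap_without_ops(struct vm_area_struct *vma)
{
	memset(vma, 0, sizeof(*vma));
}

/* Models a driver ->mmap handler that does set vm_ops. */
void mmap_with_ops(struct vm_area_struct *vma)
{
	memset(vma, 0, sizeof(*vma));
	vma->vm_ops = &driver_vm_ops;
}

void demo(void)
{
	struct vm_area_struct vma;

	mmap_without_ops(&vma);
	assert(vma_is_anonymous(&vma));		/* file-backed, yet looks anonymous */

	mmap_with_ops(&vma);
	assert(!vma_is_anonymous(&vma));	/* correctly seen as non-anonymous */
}
```

The same reasoning applies to any VMA that is zeroed and never given a
vm_ops pointer: under the assumed check it cannot be told apart from a
truly anonymous mapping.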