On 06/11/2017 20:04, Dmitry Vyukov wrote: > On Mon, Nov 6, 2017 at 5:14 PM, Paolo Bonzini <pbonzini@xxxxxxxxxx> wrote: >> On 06/11/2017 17:01, David Hildenbrand wrote: >>> On 06.11.2017 16:10, Nick Desaulniers wrote: >>>> Does it have to be stack allocated? >>> >>> We can't use kmalloc and friends in emulate.c. We would have to >>> introduce new emulator callbacks. >>> >>> a) for malloc and free. hmmm. >>> b) for carrying out the fxrstr/fixup. >>> >>> Paolo, what do you suggest? >> >> You can use kmalloc. Any userspace user of emulate.c would have to >> write a wrapper. > > > Can you please tell me more about this? Is it used for testing? Is > there an example code that builds and tests this in user-space? Not quite, there's no user outside KVM yet. But the emulator code is designed to be independent from KVM's memory access primitives; with "nm" you can see how there are very few undefined symbols: U ex_handler_default U find_first_bit U find_next_bit U memcpy U printk Exceptions are only used for div/idiv, if it gets in the way it's okay to just revert commit b8c0b6ae498f ("KVM: x86 emulator: convert DIV/IDIV to fastop", 2013-05-21). On the other hand, dependencies on Linux headers have sneaked in more and more, but refactoring those away should not be too hard. Thanks, Paolo > > >> But I'm not sure it's useful... maybe the >> asm_safe+memcpy could be moved to a separate noinline function, so that >> segmented_read_std is invoked with a leaner stack. >> >> Paolo >> >>>> >>>> On Nov 6, 2017 3:52 AM, "David Hildenbrand" <david@xxxxxxxxxx >>>> <mailto:david@xxxxxxxxxx>> wrote: >>>> >>>> On 31.10.2017 12:34, syzbot wrote: >>>> > Hello, >>>> > >>>> > syzkaller hit the following crash on >>>> > 91dfed74eabcdae9378131546c446442c29bf769 >>>> > >>>> git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master >>>> <http://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/master> >>>> > compiler: gcc (GCC) 7.1.1 20170620 >>>> > .config is attached >>>> > Raw console output is attached. >>>> > C reproducer is attached >>>> > syzkaller reproducer is attached. See https://goo.gl/kgGztJ >>>> > for information about syzkaller reproducers >>>> > >>>> > >>>> > in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: syzkaller879109 >>>> > 2 locks held by syzkaller879109/2909: >>>> > #0: (&vcpu->mutex){+.+.}, at: [<ffffffff8106222c>] >>>> vcpu_load+0x1c/0x70 >>>> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:154 >>>> > #1: (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_enter_guest >>>> > arch/x86/kvm/x86.c:6983 [inline] >>>> > #1: (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_run >>>> > arch/x86/kvm/x86.c:7061 [inline] >>>> > #1: (&kvm->srcu){....}, at: [<ffffffff810dd162>] >>>> > kvm_arch_vcpu_ioctl_run+0x1bc2/0x58b0 arch/x86/kvm/x86.c:7222 >>>> > CPU: 1 PID: 2909 Comm: syzkaller879109 Not tainted >>>> 4.13.0-rc4-next-20170811 >>>> > #1 >>>> > Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs >>>> 01/01/2011 >>>> > Call Trace: >>>> > __dump_stack lib/dump_stack.c:16 [inline] >>>> > dump_stack+0x194/0x257 lib/dump_stack.c:52 >>>> > ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6014 >>>> > __might_sleep+0x95/0x190 kernel/sched/core.c:5967 >>>> > __might_fault+0xab/0x1d0 mm/memory.c:4383 >>>> > __copy_from_user include/linux/uaccess.h:71 [inline] >>>> > __kvm_read_guest_page+0x58/0xa0 >>>> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:1771 >>>> > kvm_vcpu_read_guest_page+0x44/0x60 >>>> > arch/x86/kvm/../../../virt/kvm/kvm_main.c:1791 >>>> > kvm_read_guest_virt_helper+0x76/0x140 arch/x86/kvm/x86.c:4407 >>>> > kvm_read_guest_virt_system+0x3c/0x50 arch/x86/kvm/x86.c:4466 >>>> > segmented_read_std+0x10c/0x180 arch/x86/kvm/emulate.c:819 >>>> > em_fxrstor+0x27b/0x410 arch/x86/kvm/emulate.c:4022 >>>> >>>> >>>> In em_fxrstor, we do a get_fpu(), which in return disables preemption. >>>> >>>> With preempt_disable(), we do a >>>> segmented_read_std()->kvm_vcpu_read_guest_page(), triggering the >>>> warning. >>>> >>>> > x86_emulate_insn+0x55d/0x3c50 arch/x86/kvm/emulate.c:5471 >>>> > x86_emulate_instruction+0x411/0x1ca0 arch/x86/kvm/x86.c:5698 >>>> > kvm_mmu_page_fault+0x18b/0x2c0 arch/x86/kvm/mmu.c:4854 >>>> > handle_ept_violation+0x1fc/0x5e0 arch/x86/kvm/vmx.c:6400 >>>> > vmx_handle_exit+0x281/0x1ab0 arch/x86/kvm/vmx.c:8718 >>>> > vcpu_enter_guest arch/x86/kvm/x86.c:6999 [inline] >>>> > vcpu_run arch/x86/kvm/x86.c:7061 [inline] >>>> > kvm_arch_vcpu_ioctl_run+0x1cee/0x58b0 arch/x86/kvm/x86.c:7222 >>>> > kvm_vcpu_ioctl+0x64c/0x1010 >>>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2591 >>>> > vfs_ioctl fs/ioctl.c:45 [inline] >>>> > do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685 >>>> > SYSC_ioctl fs/ioctl.c:700 [inline] >>>> > SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 >>>> > entry_SYSCALL_64_fastpath+0x1f/0xbe >>>> >>>> I don't really see a way to avoid two fxstate variables. Unloading the >>>> fpu in between the fxstore/fxrstr could lead to host values getting >>>> overwritten. Loading/saving the fpu in kvm_arch_vcpu_ioctl_run() would >>>> most probably also not work, as the relevant portions of fxregs_state >>>> would not get saved/restored. So the preemption would still be needed. >>>> >>>> >>>> So all I can offer for now is the following (untested, can send as >>>> proper patch if needed): >>>> >>>> >>>> From f32d06c8c621c6d68e073e9bdb81a6280b6c9544 Mon Sep 17 00:00:00 2001 >>>> From: David Hildenbrand <david@xxxxxxxxxx <mailto:david@xxxxxxxxxx>> >>>> Date: Mon, 6 Nov 2017 12:35:39 +0100 >>>> Subject: [PATCH v1] KVM: x86: fix em_fxstor sleeping while in atomic >>>> >>>> Commit 9d643f63128b tried to optimize the stack size, but introduced a >>>> guest memory access which might sleep while in atomic. >>>> >>>> Let's undo that part of the commit but keep the cleanups. >>>> >>>> Reported by syzbot: >>>> >>>> in_atomic(): 1, irqs_disabled(): 0, pid: 2909, name: syzkaller879109 >>>> 2 locks held by syzkaller879109/2909: >>>> #0: (&vcpu->mutex){+.+.}, at: [<ffffffff8106222c>] >>>> vcpu_load+0x1c/0x70 >>>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:154 >>>> #1: (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_enter_guest >>>> arch/x86/kvm/x86.c:6983 [inline] >>>> #1: (&kvm->srcu){....}, at: [<ffffffff810dd162>] vcpu_run >>>> arch/x86/kvm/x86.c:7061 [inline] >>>> #1: (&kvm->srcu){....}, at: [<ffffffff810dd162>] >>>> kvm_arch_vcpu_ioctl_run+0x1bc2/0x58b0 arch/x86/kvm/x86.c:7222 >>>> CPU: 1 PID: 2909 Comm: syzkaller879109 Not tainted >>>> 4.13.0-rc4-next-20170811 >>>> Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs >>>> 01/01/2011 >>>> Call Trace: >>>> __dump_stack lib/dump_stack.c:16 [inline] >>>> dump_stack+0x194/0x257 lib/dump_stack.c:52 >>>> ___might_sleep+0x2b2/0x470 kernel/sched/core.c:6014 >>>> __might_sleep+0x95/0x190 kernel/sched/core.c:5967 >>>> __might_fault+0xab/0x1d0 mm/memory.c:4383 >>>> __copy_from_user include/linux/uaccess.h:71 [inline] >>>> __kvm_read_guest_page+0x58/0xa0 >>>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1771 >>>> kvm_vcpu_read_guest_page+0x44/0x60 >>>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:1791 >>>> kvm_read_guest_virt_helper+0x76/0x140 arch/x86/kvm/x86.c:4407 >>>> kvm_read_guest_virt_system+0x3c/0x50 arch/x86/kvm/x86.c:4466 >>>> segmented_read_std+0x10c/0x180 arch/x86/kvm/emulate.c:819 >>>> em_fxrstor+0x27b/0x410 arch/x86/kvm/emulate.c:4022 >>>> x86_emulate_insn+0x55d/0x3c50 arch/x86/kvm/emulate.c:5471 >>>> x86_emulate_instruction+0x411/0x1ca0 arch/x86/kvm/x86.c:5698 >>>> kvm_mmu_page_fault+0x18b/0x2c0 arch/x86/kvm/mmu.c:4854 >>>> handle_ept_violation+0x1fc/0x5e0 arch/x86/kvm/vmx.c:6400 >>>> vmx_handle_exit+0x281/0x1ab0 arch/x86/kvm/vmx.c:8718 >>>> vcpu_enter_guest arch/x86/kvm/x86.c:6999 [inline] >>>> vcpu_run arch/x86/kvm/x86.c:7061 [inline] >>>> kvm_arch_vcpu_ioctl_run+0x1cee/0x58b0 arch/x86/kvm/x86.c:7222 >>>> kvm_vcpu_ioctl+0x64c/0x1010 >>>> arch/x86/kvm/../../../virt/kvm/kvm_main.c:2591 >>>> vfs_ioctl fs/ioctl.c:45 [inline] >>>> do_vfs_ioctl+0x1b1/0x1520 fs/ioctl.c:685 >>>> SYSC_ioctl fs/ioctl.c:700 [inline] >>>> SyS_ioctl+0x8f/0xc0 fs/ioctl.c:691 >>>> entry_SYSCALL_64_fastpath+0x1f/0xbe >>>> RIP: 0033:0x437fc9 >>>> RSP: 002b:00007ffc7b4d5ab8 EFLAGS: 00000206 ORIG_RAX: 0000000000000010 >>>> RAX: ffffffffffffffda RBX: 00000000004002b0 RCX: 0000000000437fc9 >>>> RDX: 0000000000000000 RSI: 000000000000ae80 RDI: 0000000000000005 >>>> RBP: 0000000000000086 R08: 0000000000000000 R09: 0000000020ae8000 >>>> R10: 0000000000009120 R11: 0000000000000206 R12: 0000000000000000 >>>> R13: 0000000000000004 R14: 0000000000000004 R15: 0000000020077000 >>>> >>>> Fixes: 9d643f63128b ("KVM: x86: avoid large stack allocations in >>>> em_fxrstor") >>>> Signed-off-by: David Hildenbrand <david@xxxxxxxxxx >>>> <mailto:david@xxxxxxxxxx>> >>>> --- >>>> arch/x86/kvm/emulate.c | 16 +++++++++------- >>>> 1 file changed, 9 insertions(+), 7 deletions(-) >>>> >>>> diff --git a/arch/x86/kvm/emulate.c b/arch/x86/kvm/emulate.c >>>> index fb0055953fbc..d87f01a2d6f4 100644 >>>> --- a/arch/x86/kvm/emulate.c >>>> +++ b/arch/x86/kvm/emulate.c >>>> @@ -4002,7 +4002,7 @@ static int em_fxsave(struct x86_emulate_ctxt >>>> *ctxt) >>>> >>>> static int em_fxrstor(struct x86_emulate_ctxt *ctxt) >>>> { >>>> - struct fxregs_state fx_state; >>>> + struct fxregs_state fx_state, fx_old; >>>> int rc; >>>> size_t size; >>>> >>>> @@ -4010,19 +4010,21 @@ static int em_fxrstor(struct >>>> x86_emulate_ctxt *ctxt) >>>> if (rc != X86EMUL_CONTINUE) >>>> return rc; >>>> >>>> + size = fxstate_size(ctxt); >>>> + rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, >>>> &fx_state, size); >>>> + if (rc != X86EMUL_CONTINUE) >>>> + return rc; >>>> + >>>> ctxt->ops->get_fpu(ctxt); >>>> >>>> - size = fxstate_size(ctxt); >>>> if (size < __fxstate_size(16)) { >>>> - rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_state)); >>>> + rc = asm_safe("fxsave %[fx]", , [fx] "+m"(fx_old)); >>>> if (rc != X86EMUL_CONTINUE) >>>> goto out; >>>> + memcpy(((void *)&fx_state) + size, ((void *)&fx_old) >>>> + size, >>>> + __fxstate_size(16) - size); >>>> } >>>> >>>> - rc = segmented_read_std(ctxt, ctxt->memop.addr.mem, >>>> &fx_state, size); >>>> - if (rc != X86EMUL_CONTINUE) >>>> - goto out; >>>> - >>>> if (fx_state.mxcsr >> 16) { >>>> rc = emulate_gp(ctxt, 0); >>>> goto out; >>>> -- >>>> 2.13.6 >>>> >>>> >>>> >>>> -- >>>> >>>> Thanks, >>>> >>>> David / dhildenb >>>> >>> >>> >> >> -- >> You received this message because you are subscribed to the Google Groups "syzkaller-bugs" group. >> To unsubscribe from this group and stop receiving emails from it, send an email to syzkaller-bugs+unsubscribe@xxxxxxxxxxxxxxxx. >> To view this discussion on the web visit https://groups.google.com/d/msgid/syzkaller-bugs/fc4fec6a-13fc-e5f4-e78a-711f2fccec92%40redhat.com. >> For more options, visit https://groups.google.com/d/optout.