2017-12-13 17:42 GMT+08:00 Paolo Bonzini <pbonzini@xxxxxxxxxx>:
> On 13/12/2017 10:18, David Hildenbrand wrote:
>> On 13.12.2017 04:10, Wanpeng Li wrote:
>>> From: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>>
>>> Reported by syzkaller:
>>>
>>>    WARNING: CPU: 0 PID: 12927 at arch/x86/kernel/traps.c:780 do_debug+0x222/0x250
>>>    CPU: 0 PID: 12927 Comm: syz-executor Tainted: G OE 4.15.0-rc2+ #16
>>>    RIP: 0010:do_debug+0x222/0x250
>>>    Call Trace:
>>>     <#DB>
>>>     debug+0x3e/0x70
>>>    RIP: 0010:copy_user_enhanced_fast_string+0x10/0x20
>>>     </#DB>
>>>     _copy_from_user+0x5b/0x90
>>>     SyS_timer_create+0x33/0x80
>>>     entry_SYSCALL_64_fastpath+0x23/0x9a
>>>
>>> The syzkaller will mmap a buffer which is also the struct sigevent parameter of
>>> timer_create(), it will also call perf_event_open() to set a BP for the buffer,
>>> so when the implementation of timer_create() in kernel tries to get the struct
>>> sigevent parameter by copy_from_user(), rep movsb triggers the BP. The syzkaller
>>> testcase also sets the debug registers for the guest, however, the kvm just
>>> restores host debug registers when we have active breakpoints. I can observe
>>> the dr6 single step bit is set and !hw_breakpoint_active() sporadically by print
>>> when running the testcase heavy multithreading. The do_debug() which is triggered
>>> by rep movsb will splash when (dr6 & DR_STEP && !user_mode(regs)).
>>>
>>> This patch fixes it by restoring host dr6 in sched_out if no breakpoint is active.
>>>
>>> Reported-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
>>> Cc: Paolo Bonzini <pbonzini@xxxxxxxxxx>
>>> Cc: Radim Krčmář <rkrcmar@xxxxxxxxxx>
>>> Cc: David Hildenbrand <david@xxxxxxxxxx>
>>> Cc: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
>>> Reviewed-by: David Hildenbrand <david@xxxxxxxxxx>
>>> Signed-off-by: Wanpeng Li <wanpeng.li@xxxxxxxxxxx>
>>> ---
>>> v1 -> v2:
>>>  * move to sched_out path
>>>
>>>  arch/x86/kvm/x86.c | 2 ++
>>>  1 file changed, 2 insertions(+)
>>>
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 1c5c7a3..76886c4 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -2964,6 +2964,8 @@ void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
>>>      pagefault_enable();
>>>      kvm_x86_ops->vcpu_put(vcpu);
>>>      vcpu->arch.last_host_tsc = rdtsc();
>>
>> Can you add a comment like
>>
>> /* With active breakpoints we already restored all debugregs in
>> vcpu_enter_guest(), however without active breakpoints we have to
>> restore debugreg 6 before scheduled out.
>> */
>
> Actually, we should make it unconditionally zero, not reset it to
> current->thread.debugreg6. That's because the invariant at exit from
> do_debug is DR6 = 0.
>
>         /*
>          * do_debug expects dr6 to be cleared after it runs, but here
>          * we might have a stale dr6 from the guest.
>          */
>         set_debugreg(0, 6);
>
> I'll push the patch to kvm/queue.

Do you need me to send a new version?

Regards,
Wanpeng Li