[Bug 53681] New: nVMX: Rare crash on shadow-on-shadow case

bugzilla-daemon@xxxxxxxxxxxxxxxxxxx · Tue, 12 Feb 2013 08:24:20 +0000 (UTC)

https://bugzilla.kernel.org/show_bug.cgi?id=53681

           Summary: nVMX: Rare crash on shadow-on-shadow case
           Product: Virtualization
           Version: unspecified
          Platform: All
        OS/Version: Linux
              Tree: Mainline
            Status: NEW
          Severity: low
          Priority: P1
         Component: kvm
        AssignedTo: virtualization_kvm@xxxxxxxxxxxxxxxxxxxx
        ReportedBy: nyh@xxxxxxxxxxxxxxxxxxx
        Regression: No

I ran the following stress test of nested VMX (on an April 2011 codebase, so
this bug needs to be verified again!): L0 and L1 both run KVM, and L0, L1 and
L2 all run Ubuntu. L0 has 16 hardware threads and runs a parallel compilation
("make -j16") in a loop. L1 and L2 each get one vcpu and run "make -j3". This
test is especially heavy on context switches (which happen on all levels) and
on memory management (as all the separate processes have their own page
tables).

With the default nested MMU virtualization, shadow-on-EPT, things appear to
work fine, and this stress test ran for 24 hours without incident.

However, with the non-recommended, slower shadow-on-shadow mode (i.e., ept=0
in L0), L0 suddenly died after a couple of hours of successful compilation,
with the following oops:

BUG: unable to handle kernel NULL pointer dereference at 0000000000000030
IP: [<ffffffffa0015414>] mark_unsync+0x0/0x2a [kvm]
PGD 1746df067 PUD 174f39067 PMD 0 
Oops: 0000 [#1] SMP 
last sysfs file: /sys/devices/system/cpu/cpu9/cpufreq/scaling_governor
CPU 15 
Modules linked in: kvm_intel kvm [last unloaded: kvm]

Pid: 3353, comm: qemu-system-x86 Tainted: G    B       2.6.37mx-66117-gb966170 #234 49Y6498     /IBM System x -[794692G]-
RIP: 0010:[<ffffffffa0015414>]  [<ffffffffa0015414>] mark_unsync+0x0/0x2a [kvm]
RSP: 0018:ffff880101131760  EFLAGS: 00010256
RAX: 0000000000000000 RBX: ffff880171ce87c0 RCX: 0000000000000001
RDX: 0000000000000001 RSI: ffff880000000ff7 RDI: 0000000000000000
RBP: ffff880101131798 R08: 0000000000000001 R09: 0000000000000001
R10: 0000000000000000 R11: ffffea0000000000 R12: 0000000000000008
R13: ffffea0000000000 R14: ffff880171ce8798 R15: ffff880000000ff7
FS:  00007fabf2b02910(0000) GS:ffff88007d5e0000(0000) knlGS:ffffffff80872980
CS:  0010 DS: 002b ES: 002b CR0: 000000008005003b
CR2: 0000000000000030 CR3: 000000017a59a000 CR4: 00000000000026f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process qemu-system-x86 (pid: 3353, threadinfo ffff880101130000, task ffff88007de87080)
Stack:
 ffffffffa0014aae ffff8801011317c8 ffff88006a9ea130 ffff880162618040
 ffff880076373068 0000000000056a0d 800000010b838203 ffff8801011317a8
 ffffffffa001543c ffff8801011317e8 ffffffffa0014a72 ffff8801011317c8
Call Trace:
 [<ffffffffa0014aae>] ? T.927+0x84/0xae [kvm]
 [<ffffffffa001543c>] mark_unsync+0x28/0x2a [kvm]
 [<ffffffffa0014a72>] T.927+0x48/0xae [kvm]
 [<ffffffffa001543c>] mark_unsync+0x28/0x2a [kvm]
 [<ffffffffa0014a72>] T.927+0x48/0xae [kvm]
 [<ffffffffa00156bd>] set_spte+0x27f/0x349 [kvm]
 [<ffffffffa0015882>] mmu_set_spte+0xfb/0x328 [kvm]
 [<ffffffffa0015c5f>] __direct_pte_prefetch+0x1b0/0x1ff [kvm]
 [<ffffffffa0011954>] ? gfn_to_rmap+0x12/0x4d [kvm]
 [<ffffffffa0017473>] paging64_page_fault+0x450/0x6b3 [kvm]
 [<ffffffffa00141fd>] kvm_mmu_page_fault+0x24/0x7f [kvm]
 [<ffffffffa0c3d6b4>] handle_exception+0x19f/0x31f [kvm_intel]
 [<ffffffffa000167d>] ? kvm_vcpu_block+0x31/0xa9 [kvm]
 [<ffffffffa0c40745>] vmx_handle_exit+0x5e4/0x613 [kvm_intel]
 [<ffffffffa000e698>] kvm_arch_vcpu_ioctl_run+0xa13/0xd92 [kvm]
 [<ffffffffa000e5fe>] ? kvm_arch_vcpu_ioctl_run+0x979/0xd92 [kvm]
 [<ffffffffa0c3eda6>] ? vmx_vcpu_load+0x2e/0x180 [kvm_intel]
 [<ffffffffa000d3d0>] ? kvm_arch_vcpu_load+0x8f/0x10b [kvm]
 [<ffffffffa000344f>] kvm_vcpu_ioctl+0x113/0x4e4 [kvm]
 [<ffffffffa0002d9d>] ? kvm_vm_ioctl+0x362/0x38b [kvm]
 [<ffffffff810add27>] do_vfs_ioctl+0x4a8/0x4f7
 [<ffffffff810a0d5a>] ? fget_light+0xdd/0xeb
 [<ffffffff810a0ccf>] ? fget_light+0x52/0xeb
 [<ffffffff810addb8>] sys_ioctl+0x42/0x65
 [<ffffffff81001f7b>] system_call_fastpath+0x16/0x1b
Code: 08 41 bc 01 00 00 00 eb 10 48 8b b3 70 03 00 00 48 89 df ff 93 20 03 00
00 48 83 c4 38 44 89 e0 5b 41 5c 41 5d 41 5e 41 5f c9 c3 <48> 2b 77 30 55 48 c1
ee 03 48 89 e5 0f ab 77 60 19 f6 85 f6 75 
RIP  [<ffffffffa0015414>] mark_unsync+0x0/0x2a [kvm]
 RSP <ffff880101131760>
CR2: 0000000000000030
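
As a cross-check on the oops above, the faulting bytes marked <48> in the
"Code:" line can be decoded by hand. The sketch below is only an illustration:
it handles the single encoding seen here (REX.W + 2B /r with an 8-bit
displacement) and is not a general disassembler. The helper name
decode_sub_r64_rm64 is made up for this example, and the register values are
copied from the dump.

```python
# Minimal sketch: decode the faulting instruction from the oops "Code:" line.
# Only the encoding REX.W + 2B /r with disp8 is handled (an assumption that
# holds for these four bytes); anything else trips an assertion.

REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi"]

def decode_sub_r64_rm64(code):
    """Decode 'REX.W 2B /r disp8' (SUB r64, r/m64) into (dst, base, disp)."""
    rex, opcode, modrm, disp = code
    assert rex == 0x48, "expected a plain REX.W prefix"
    assert opcode == 0x2B, "expected SUB r64, r/m64"
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    assert mod == 0b01, "expected [base + disp8] addressing"
    assert rm != 0b100, "SIB byte not handled in this sketch"
    if disp >= 0x80:          # disp8 is a signed byte
        disp -= 0x100
    return REG64[reg], REG64[rm], disp

# Bytes at the faulting RIP: "<48> 2b 77 30" in the Code: line.
faulting = bytes.fromhex("482b7730")
dst, base, disp = decode_sub_r64_rm64(faulting)

# Values copied from the register dump and the CR2 line of the oops.
regs = {"rdi": 0x0, "rsi": 0xFFFF880000000FF7}
cr2 = 0x30

fault_addr = regs[base] + disp
print(f"sub {hex(disp)}(%{base}),%{dst}  reads address {hex(fault_addr)}")
print("matches CR2:", fault_addr == cr2)
```

The decoded instruction reads from offset 0x30 of whatever RDI points to.
Since RDI is zero and the fault is at mark_unsync+0x0 (the very first
instruction), this suggests that mark_unsync was entered with a NULL first
argument, which is consistent with the reported dereference at
0000000000000030.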
