Re: [RFH] NULL pointer dereference oops occurs when running kvm VM

Thanks,

The bug is hard to reproduce; it has appeared only twice. Unfortunately,
I had not enabled kdump before the bug appeared.

I want to find clues in these logs. Of course, I will also try to reproduce it.
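
For anyone reading the log quoted below, here is a minimal Python sketch that decodes the value printed on the "Oops: 0002" line. The helper name decode_pf_error is hypothetical (it is not a kernel or tool API); the bit meanings are the standard x86 page-fault error code bits.

```python
# Low five bits of the x86 page-fault error code.
# Each entry: (mask, meaning if set, meaning if clear or None).
PF_BITS = [
    (0x01, "protection violation", "page not present"),
    (0x02, "write access", "read access"),
    (0x04, "user mode", "kernel mode"),
    (0x08, "reserved bit set in page table", None),
    (0x10, "instruction fetch", None),
]


def decode_pf_error(code_str):
    """Return readable flags for the hex value printed after 'Oops:'."""
    code = int(code_str, 16)
    flags = []
    for mask, if_set, if_clear in PF_BITS:
        if code & mask:
            flags.append(if_set)
        elif if_clear is not None:
            flags.append(if_clear)
    return flags


# Value taken from the "Oops: 0002 [#1] SMP" line in the log.
flags = decode_pf_error("0002")
print(flags)
```

Decoded this way, 0002 reports a kernel-mode write to a not-present page, while the instruction at the reported RIP only loads from memory. That mismatch is part of why the log looks inconsistent to me.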


On 2016/8/12 17:34, Yadi wrote:
> On 2016-08-12 17:08, Xiexiangyou wrote:
>> A KVM VM runs on a hardware server with an Intel Broadwell CPU. An oops occurs.
>>
>> kernel version: 3.0.93
>> kvm version: 3.6
>> CPU: Intel(R) Xeon(R) CPU E5-2618L v4 @ 2.20GHz
>>
>> The message is as follows:
>> <1>[25808.222049] BUG: unable to handle kernel NULL pointer dereference at           (null)
>> <1>[25808.230539] IP: [<ffffffffa021f3c5>] vcpu_enter_guest+0x555/0x790 [kvm]
>> <4>[25808.237496] PGD 0
>> <1>[25808.239839] Thread overran stack, or stack corrupted
> I am not 100 percent sure, this is just a suggestion: turn on the stack tracer to see how deep the kernel stack gets: echo 1 > /proc/sys/kernel/stack_tracer_enabled
> 
>> <0>[25808.245107] Oops: 0002 [#1] SMP
>> <4>[25808.286629] CPU 2
>> <4>[25808.288464] Modules linked in: kbox(F) target_core_pscsi(F) target_core_file(F) target_core_iblock(F) die_notify(FN) signo_catch ipmi_devintf(F) ipmi_si(F) ipmi_msghandler(F) bonding(F) iptable_filter(F) ip_tables(F) x_tables(F) pmcint(F) openvswitch(F) gre(F) crc32c(F) libcrc32c(F) mperf(F) uhci_hcd(F) thermal(F) tg3(F) pcmcia(F) pcmcia_core(F) pciehp(F) pci_hotplug(F) nfs(F) lockd(F) fscache(F) auth_rpcgss(F) nfs_acl(F) sunrpc(F) mlx4_en(F) mlx4_core(F) compat(F) ide_cd_mod(F) ide_core(F) hpsa(F) fan(F) esp4(F) e1000(F) ata_generic(F) af_packet(F) vhost_scsi(F) target_core_mod(F) configfs(F) loop(F) dm_mod(F) ext3(F) jbd(F) mbcache(F) scsi_dh_rdac(F) scsi_dh_hp_sw(F) scsi_dh_emc(F) scsi_dh_alua(F) scsi_dh(F) mptsas(F) mptscsih(F) mptctl(F) mptbase(F) mpt2sas(F) scsi_transport_sas(F) raid_class(F) sd_mod(F) crc_t10dif(F) usbhid(F) hid(F) usb_storage(F) sr_mod(F) cdrom(F) vhost_net(F) macvtap(F) sg(F) macvlan(F) ixgbe(FX) tun(F) igb(F) ehci_hcd(F) kvm_intel(F) ipv6(F) dca(F) ipv6_lib(F) kvm(F) usbcore(F) ptp(F) i2c_i801(F) i2c_core(F) usb_common(F) megaraid_sas(F) pps_core(F) rtc_cmos(F) processor(F) thermal_sys(F) hwmon(F) button(F) ata_piix(F) ahci(F) libahci(F) libata(F) scsi_mod(F) [last unloaded: kbox]
>> <4>[25808.402500] Supported: No, Unsupported modules are loaded
>> <4>[25808.408212]
>> <4>[25808.410023] Pid: 29180, comm: qemu-kvm Tainted: GF          NX 3.0.93-0.8-default #1 Huawei RH2288H V3/BC11HGSA0
>> <4>[25808.420863] RIP: 0010:[<ffffffffa021f3c5>]  [<ffffffffa021f3c5>] vcpu_enter_guest+0x555/0x790 [kvm]
>> <4>[25808.430560] RSP: 0018:ffff882fe1141d88  EFLAGS: 00010046
>> <4>[25808.436179] RAX: 0000000000000000 RBX: ffffffffa00d5270 RCX: 0000000000000000
>> <4>[25808.443617] RDX: ffff88187f88cee0 RSI: 0000000000000000 RDI: 0000000000000002
>> <4>[25808.451049] RBP: ffff8817bfba8140 R08: ffff8817c28be4c0 R09: 0000000000000000
>> <4>[25808.458490] R10: ffff8817bfbac100 R11: ffffffff81017ea0 R12: 0000000000000000
>> <4>[25808.465933] R13: ffff8817bfba8170 R14: 0000000000000000 R15: 0000000000000000
>> <4>[25808.473374] FS:  00007f1083e60700(0000) GS:ffff88187f880000(0000) knlGS:0000000000000000
>> <4>[25808.482088] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> <4>[25808.488145] CR2: 0000000000000000 CR3: 00000017c2a13000 CR4: 00000000001427e0
>> <4>[25808.495577] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> <4>[25808.503013] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>> <4>[25808.510447] Process qemu-kvm (pid: 29180, threadinfo ffff882fe1140000, task ffff882fde8ac3c0)
>> <0>[25808.519588] Stack:
>> <4>[25808.521921]  ffff8817bfba8140 ffff882fde8ac3c0 0000000000000206 ffffffffa020b9e5
>> <4>[25808.530016]  0000000000000000 ffff882fde8ac3c0 ffffffff81082f10 ffff882fe1141dc0
>> <4>[25808.538103]  ffff882fe1141dc0 ffff8817bfba8140 ffff8817bfba8170 0000000000000001
>> <0>[25808.546203] Call Trace:
>> <4>[25808.549010]  [<ffffffffa021f798>] __vcpu_run+0x198/0x260 [kvm]
>> <4>[25808.562703]  [<ffffffffa0220418>] kvm_arch_vcpu_ioctl_run+0x68/0x1a0 [kvm]
>> <4>[25808.569851]  [<ffffffffa020ccee>] kvm_vcpu_ioctl+0x38e/0x580 [kvm]
>> <4>[25808.576344]  [<ffffffff8116bf6b>] do_vfs_ioctl+0x8b/0x3b0
>> <4>[25808.582059]  [<ffffffff8116c331>] sys_ioctl+0xa1/0xb0
>> <4>[25808.587425]  [<ffffffff81469872>] system_call_fastpath+0x16/0x1b
>> <4>[25808.593753]  [<00007f10871c1ce7>] 0x7f10871c1ce6
>> <0>[25808.598630] Code: 65 24 85 c0 74 25 48 8b 1d 99 30 04 00 48 85 db 74 19 48 8b 03 90 48 8b 7b 08 48 83 c3 10 44 89 e6 ff d0 48 8b 03 48 85 c0 75 eb
>>       [25808.612804] <48> 8b 05 14 51 04 00 48 89 ef ff 90 48 01 00 00 65 48 8b 04 25
>> <1>[25808.620776] RIP  [<ffffffffa021f3c5>] vcpu_enter_guest+0x555/0x790 [kvm]
>> <4>[25808.627768]  RSP <ffff882fe1141d88>
>> <0>[25808.631572] CR2: 0000000000000000
>>
>>
>> The instruction at "RIP vcpu_enter_guest+0x555" is: "mov    0x45114(%rip),%rax"
>>
>> The assembly code is:
>> 0xffffffffa02f73bd <vcpu_enter_guest+1357>:     mov    (%rbx),%rax
>> 0xffffffffa02f73c0 <vcpu_enter_guest+1360>:     test   %rax,%rax
>> 0xffffffffa02f73c3 <vcpu_enter_guest+1363>:     jne    0xffffffffa02f73b0 <vcpu_enter_guest+1344>
>> 0xffffffffa02f73c5 <vcpu_enter_guest+1365>:     mov    0x45114(%rip),%rax        # 0xffffffffa033c4e0 <kvm_x86_ops>
>> 0xffffffffa02f73cc <vcpu_enter_guest+1372>:     mov    %rbp,%rdi
>> 0xffffffffa02f73cf <vcpu_enter_guest+1375>:     callq  *0x148(%rax)
>>
>> It should be impossible for the instruction "mov  0x45114(%rip),%rax" to trigger a BUG like "unable to handle kernel NULL pointer dereference at (null)".
>> Has anyone met this issue before? Is it a CPU bug?
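
As a cross-check of the listing above, a small Python sketch that recomputes the address read by the RIP-relative mov from the bytes at the <48> marker in the "Code:" line. The function name rip_relative_target is hypothetical, and the 7-byte instruction length is taken from the gap between +1365 and +1372 in the disassembly.

```python
MASK64 = (1 << 64) - 1


def rip_relative_target(insn_addr, insn_len, disp32):
    """Address accessed by 'mov disp32(%rip),%reg'.

    RIP-relative addressing uses the address of the NEXT instruction
    (insn_addr + insn_len) plus the signed 32-bit displacement.
    """
    return (insn_addr + insn_len + disp32) & MASK64


# Address of the mov in the disassembly (vcpu_enter_guest+1365).
insn_addr = 0xFFFFFFFFA02F73C5

# Bytes starting at the <48> marker in the "Code:" line:
# 48 8b 05 = REX.W mov with RIP-relative operand, then a 4-byte displacement.
insn_bytes = bytes.fromhex("488b0514510400")
disp32 = int.from_bytes(insn_bytes[3:7], "little", signed=True)

target = rip_relative_target(insn_addr, len(insn_bytes), disp32)
print(hex(disp32), hex(target))

# The decimal offset in the disassembly matches the hex offset in the oops.
print(hex(1365))
```

The result agrees with the disassembler comment (kvm_x86_ops), and +1365 equals +0x555, so the "Code:" bytes and the listing describe the same instruction. The absolute addresses in the listing differ from the RIP in the oops, presumably because the module was loaded at a different base when it was disassembled.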
>>
>> Best regards!
