NULL pointer dereference in kernel code, ignored parameters in libkvm

    Hi. I'm a developer on the M5 simulator (m5sim.org) working on a CPU
model which uses kvm as its execution engine. I ran into a kernel "BUG"
where a NULL pointer is being dereferenced in gfn_to_rmap.

    What's happening on the kernel side is that gfn_to_rmap calls
gfn_to_memslot. That function looks for the gfn in the memory slots,
fails to find it, and returns a NULL pointer. gfn_to_rmap then
dereferences the result without checking it, and the kernel oopses. I
believe the call to gfn_to_memslot originated in mmu_alloc_roots (as of
2.6.28.9; it may have moved since), which tries to get the page pointed
to by CR3 using kvm_mmu_get_page. I may have that part wrong, so here's
the log output from the kernel.

May 15 18:54:46 fajita BUG: unable to handle kernel NULL pointer dereference at 0000000000000000
May 15 18:54:46 fajita IP: [<ffffffff802127b3>] gfn_to_rmap+0x17/0x48
May 15 18:54:46 fajita PGD 136051067 PUD 1299fd067 PMD 0
May 15 18:54:46 fajita Oops: 0000 [#1] SMP
May 15 18:54:46 fajita last sysfs file: /sys/power/state
May 15 18:54:46 fajita CPU 0
May 15 18:54:46 fajita Modules linked in: snd_hda_intel nvidia(P) snd_pcm snd_timer snd iwlagn snd_page_alloc
May 15 18:54:46 fajita Pid: 7325, comm: m5.opt Tainted: P           2.6.28.9 #2
May 15 18:54:46 fajita RIP: 0010:[<ffffffff802127b3>]  [<ffffffff802127b3>] gfn_to_rmap+0x17/0x48
May 15 18:54:46 fajita RSP: 0018:ffff880129963cf8  EFLAGS: 00010246
May 15 18:54:46 fajita RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
May 15 18:54:46 fajita RDX: 0000000000000000 RSI: 0000000000000070 RDI: ffff8801268d8000
May 15 18:54:46 fajita RBP: 0000000000000070 R08: 000000000000000a R09: 0000000000000000
May 15 18:54:46 fajita R10: 000000000000008b R11: 0000000000000002 R12: 0000000000000070
May 15 18:54:46 fajita R13: 0000000000000000 R14: 000000000000ae80 R15: 0000000000000070
May 15 18:54:46 fajita FS:  0000000041e1d950(0063) GS:ffffffff80ab2040(0000) knlGS:0000000000000000
May 15 18:54:46 fajita CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
May 15 18:54:46 fajita CR2: 0000000000000000 CR3: 0000000129909000 CR4: 00000000000026e0
May 15 18:54:46 fajita DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
May 15 18:54:46 fajita DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
May 15 18:54:46 fajita Process m5.opt (pid: 7325, threadinfo ffff880129962000, task ffff88013a1eacd0)
May 15 18:54:46 fajita Stack:
May 15 18:54:46 fajita ffff88013aba6800 ffff8801299727b0 ffff8801268d8000 ffffffff80213abe
May 15 18:54:46 fajita 00000000000080d0 ffff8801299727b0 ffff88012f040590 00000000000e0044
May 15 18:54:46 fajita ffff880129972040 ffffffff80213eeb ffff88013b282380 0000000000000246
May 15 18:54:46 fajita Call Trace:
May 15 18:54:46 fajita [<ffffffff80213abe>] ? rmap_write_protect+0x25/0x123
May 15 18:54:46 fajita [<ffffffff80213eeb>] ? kvm_mmu_get_page+0x2cb/0x320
May 15 18:54:46 fajita [<ffffffff80214f51>] ? kvm_mmu_load+0x80/0x1b1
May 15 18:54:46 fajita [<ffffffff806db286>] ? __down_read+0x12/0x93
May 15 18:54:46 fajita [<ffffffff8020fc9c>] ? kvm_arch_vcpu_ioctl_run+0x1ce/0x621
May 15 18:54:46 fajita [<ffffffff8020b590>] ? kvm_vcpu_ioctl+0xf2/0x448
May 15 18:54:46 fajita [<ffffffff80287a8d>] ? handle_mm_fault+0x367/0x6dd
May 15 18:54:46 fajita [<ffffffff802ae03e>] ? vfs_ioctl+0x21/0x6b
May 15 18:54:46 fajita [<ffffffff802ae402>] ? do_vfs_ioctl+0x37a/0x3c1
May 15 18:54:46 fajita [<ffffffff806dd616>] ? do_page_fault+0x444/0x806
May 15 18:54:46 fajita [<ffffffff80407353>] ? __up_write+0x21/0x10e
May 15 18:54:46 fajita [<ffffffff802ae485>] ? sys_ioctl+0x3c/0x5c
May 15 18:54:46 fajita [<ffffffff802234db>] ? system_call_fastpath+0x16/0x1b
May 15 18:54:46 fajita Code: 26 21 80 48 89 f3 e8 33 ff ff ff 48 89 df 5b e9 c0 fe ff ff 55 48 89 f5 53 89 d3 48 83 ec 08 e8 60 78 ff ff 85 db 48 89 c1 75 11 <48> 2b 28 48 8d 14 ed 00 00 00 00 48 03 50 18 eb 19 48 8b 00 48
May 15 18:54:46 fajita RIP  [<ffffffff802127b3>] gfn_to_rmap+0x17/0x48
May 15 18:54:46 fajita RSP <ffff880129963cf8>
May 15 18:54:46 fajita CR2: 0000000000000000
May 15 18:54:46 fajita ---[ end trace 61dc41d5d0f7fc5f ]---



I looked in your git repository and this bug seems to be present in your
most recent code.

The second problem was that CR3 didn't point to any memory even though
it had a valid value (0x7000). This was because our code relied on
kvm_create to set up physical memory. While kvm_create takes parameters
describing that memory and passes them around, it never actually seems
to do anything with them. This also seems to be the case in your most
recent code.
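
As a point of reference, guest physical memory can be registered
directly with the KVM_SET_USER_MEMORY_REGION ioctl from <linux/kvm.h>,
without going through libkvm. The following is only a minimal sketch of
that call, not what libkvm or M5 actually does: the helper names
make_region and register_guest_memory are hypothetical, and it assumes
the caller already holds a VM file descriptor obtained with
KVM_CREATE_VM and a page-aligned host buffer.

```c
#include <linux/kvm.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>

/* Hypothetical helper: describe one slot of guest physical memory that
 * is backed by host_mem in the calling process. */
struct kvm_userspace_memory_region
make_region(uint32_t slot, uint64_t guest_phys_addr, uint64_t size,
            void *host_mem)
{
    struct kvm_userspace_memory_region region;

    memset(&region, 0, sizeof(region));
    region.slot = slot;
    region.guest_phys_addr = guest_phys_addr;
    region.memory_size = size;
    region.userspace_addr = (uint64_t)(uintptr_t)host_mem;
    return region;
}

/* Hypothetical helper: hand the region to the kernel. vm_fd is assumed
 * to be the descriptor returned by KVM_CREATE_VM. Returns the ioctl
 * result: 0 on success, -1 with errno set on failure. */
int register_guest_memory(int vm_fd, uint32_t slot,
                          uint64_t guest_phys_addr, uint64_t size,
                          void *host_mem)
{
    struct kvm_userspace_memory_region region =
        make_region(slot, guest_phys_addr, size, host_mem);

    return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION, &region);
}
```

With a slot covering the page that CR3 refers to, the lookup for that
gfn would have a memory slot to find.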

The series of events leading to the BUG was then the following:

1. Our code calls kvm_create to create the VM and its physical memory,
but only the VM is actually created.
2. Our code tries to start a CPU in that VM from a point where paging is
turned on and CR3 has a value that points into the physical memory that
doesn't exist.
3. The kernel code tries to get at the reverse mapping for the guest
page frame number.
4. Code below that tries to find the "slot" for that address, fails to
do so, but continues anyway, causing the kernel to dereference a NULL
pointer.
5. Kablooey.
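
Steps 3 and 4 can be sketched in user-space C. This is a simplified
model, not the kernel's code: the struct layouts and the names
lookup_memslot and lookup_rmap_checked are made up for illustration.
It shows a slot lookup that returns NULL for an unmapped gfn, and a
reverse-map lookup that checks for that NULL instead of dereferencing
it.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

/* Simplified stand-in for a memory slot: a run of guest page frames
 * with one reverse-map entry per page. */
struct memslot {
    gfn_t base_gfn;
    unsigned long npages;
    unsigned long *rmap;
};

/* Simplified stand-in for the per-VM slot table. */
struct vm {
    int nmemslots;
    struct memslot memslots[4];
};

/* Find the slot containing gfn, or NULL if no slot covers it. */
struct memslot *lookup_memslot(struct vm *vm, gfn_t gfn)
{
    for (int i = 0; i < vm->nmemslots; i++) {
        struct memslot *slot = &vm->memslots[i];

        if (gfn >= slot->base_gfn && gfn < slot->base_gfn + slot->npages)
            return slot;
    }
    return NULL;
}

/* Reverse-map lookup that tolerates a missing slot. Without the NULL
 * check, reading slot->base_gfn is the kind of dereference described
 * in step 4. */
unsigned long *lookup_rmap_checked(struct vm *vm, gfn_t gfn)
{
    struct memslot *slot = lookup_memslot(vm, gfn);

    if (slot == NULL)
        return NULL;
    return &slot->rmap[gfn - slot->base_gfn];
}
```

A caller would still have to decide what to do with the NULL result,
for example failing the operation back to user space; the sketch does
not cover that.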


I am a full-time employee of VMware, and while I work on M5 on my own
time, that places certain limits on what I can do to help fix these
bugs. While I probably can't implement anything, I should be able to
provide more information about what we're doing with M5 or about the
crash if that would help.

Gabe Black