Re: KVM on ARM crashes with new VGIC v4.7-rc7

On 2016-07-23 03:20, Marc Zyngier wrote:
> On Sat, 23 Jul 2016 00:45:50 -0700
> Stefan Agner <stefan@xxxxxxxx> wrote:
> 
>> On 2016-07-22 11:11, Marc Zyngier wrote:
>> > On 22/07/16 18:56, Stefan Agner wrote:
>> >> On 2016-07-22 10:49, Marc Zyngier wrote:
>> >>> On 22/07/16 18:38, Andrew Jones wrote:
>> >>>> On Fri, Jul 22, 2016 at 04:40:15PM +0100, Marc Zyngier wrote:
>> >>>>> On 22/07/16 15:35, Andrew Jones wrote:
>> >>>>>> On Fri, Jul 22, 2016 at 11:42:02AM +0100, Andre Przywara wrote:
>> >>>>>>> Hi Stefan,
>> >>>>>>>
>> >>>>>>> On 22/07/16 06:57, Stefan Agner wrote:
>> >>>>>>>> Hi,
>> >>>>>>>>
>> >>>>>>>> I tried KVM on a Cortex-A7 platform (i.MX 7Dual SoC) and encountered
>> >>>>>>>> this stack trace immediately after invoking qemu-system-arm:
>> >>>>>>>>
>> >>>>>>>> Unable to handle kernel paging request at virtual address ffffffe4
>> >>>>>>>> pgd = 8ca52740
>> >>>>>>>> [ffffffe4] *pgd=80000080007003, *pmd=8ff7e003, *pte=00000000
>> >>>>>>>> Internal error: Oops: 207 [#1] SMP ARM
>> >>>>>>>> Modules linked in:
>> >>>>>>>> CPU: 0 PID: 329 Comm: qemu-system-arm Tainted: G        W
>> >>>>>>>> 4.7.0-rc7-00094-gea3ed2c #109
>> >>>>>>>> Hardware name: Freescale i.MX7 Dual (Device Tree)
>> >>>>>>>> task: 8ca3ee40 ti: 8d2b0000 task.ti: 8d2b0000
>> >>>>>>>> PC is at do_raw_spin_lock+0x8/0x1dc
>> >>>>>>>> LR is at kvm_vgic_flush_hwstate+0x8c/0x224
>> >>>>>>>> pc : [<8027c87c>]    lr : [<802172d4>]    psr: 60070013
>> >>>>>>>> sp : 8d2b1e38  ip : 8d2b0000  fp : 00000001
>> >>>>>>>> r10: 8d2b0000  r9 : 00010000  r8 : 8d2b8e54
>> >>>>>>>> fec 30be0000.ethernet eth0: MDIO read timeout
>> >>>>>>>> r7 : 8d2b8000  r6 : 8d2b8e74  r5 : 00000000  r4 : ffffffe0
>> >>>>>>>> r3 : 00004ead  r2 : 00000000  r1 : 00000000  r0 : ffffffe0
>> >>>>>>>> Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment user
>> >>>>>>>> Control: 30c5387d  Table: 8ca52740  DAC: fffffffd
>> >>>>>>>> Process qemu-system-arm (pid: 329, stack limit = 0x8d2b0210)
>> >>>>>>>> Stack: (0x8d2b1e38 to 0x8d2b2000)
>> >>>>>>>> 1e20:                                                       ffffffe0
>> >>>>>>>> 00000000
>> >>>>>>>> 1e40: 8d2b8e74 8d2b8000 8d2b8e54 00010000 8d2b0000 802172d4 8d2b8000
>> >>>>>>>> 810074f8
>> >>>>>>>> 1e60: 81007508 8ca5f800 8d284000 00010000 8d2b0000 8020fbd4 8ce9a000
>> >>>>>>>> 8ca5f800
>> >>>>>>>> 1e80: 00000000 00010000 00000000 00ff0000 8d284000 00000000 00000000
>> >>>>>>>> 7ffbfeff
>> >>>>>>>> 1ea0: fffffffe 00000000 8d28b780 00000000 755fec6c 00000000 00000000
>> >>>>>>>> ffffe000
>> >>>>>>>> 1ec0: 8d2b8000 00000000 8d28b780 00000000 755fec6c 8020af90 00000000
>> >>>>>>>> 8023f248
>> >>>>>>>> 1ee0: 0000000a 755fe98c 8d2b1f08 00000008 8021aa84 ffffe000 00000000
>> >>>>>>>> 00000000
>> >>>>>>>> 1f00: 8a00d860 8d28b780 80334f94 00000000 8d2b0000 80334748 00000000
>> >>>>>>>> 00000000
>> >>>>>>>> 1f20: 00000000 8d28b780 00004000 00000009 8d28b500 00000024 8104ebee
>> >>>>>>>> 80bc2ec4
>> >>>>>>>> 1f40: 80bafa24 8034138c 00000000 00000000 80341248 00000000 755fec6c
>> >>>>>>>> 007c1e70
>> >>>>>>>> 1f60: 00000009 00004258 0000ae80 8d28b781 00000009 8d28b780 0000ae80
>> >>>>>>>> 00000000
>> >>>>>>>> 1f80: 8d2b0000 00000000 755fec6c 80334f94 007c1e70 322a7400 00004258
>> >>>>>>>> 00000036
>> >>>>>>>> 1fa0: 8021aa84 8021a900 007c1e70 322a7400 00000009 0000ae80 00000000
>> >>>>>>>> 755feac0
>> >>>>>>>> 1fc0: 007c1e70 322a7400 00004258 00000036 7e9aff58 01151da4 76f8b4c0
>> >>>>>>>> 755fec6c
>> >>>>>>>> 1fe0: 0038192c 755fea9c 00048ae7 7697d66c 60070010 00000009 00000000
>> >>>>>>>> 00000000
>> >>>>>>>> [<8027c87c>] (do_raw_spin_lock) from [<802172d4>]
>> >>>>>>>> (kvm_vgic_flush_hwstate+0x8c/0x224)
>> >>>>>>>> [<802172d4>] (kvm_vgic_flush_hwstate) from [<8020fbd4>]
>> >>>>>>>> (kvm_arch_vcpu_ioctl_run+0x110/0x478)
>> >>>>>>>> [<8020fbd4>] (kvm_arch_vcpu_ioctl_run) from [<8020af90>]
>> >>>>>>>> (kvm_vcpu_ioctl+0x2e0/0x6d4)
>> >>>>>>>> [<8020af90>] (kvm_vcpu_ioctl) from [<80334748>]
>> >>>>>>>> (do_vfs_ioctl+0xa0/0x8b8)
>> >>>>>>>> [<80334748>] (do_vfs_ioctl) from [<80334f94>] (SyS_ioctl+0x34/0x5c)
>> >>>>>>>> [<80334f94>] (SyS_ioctl) from [<8021a900>] (ret_fast_syscall+0x0/0x1c)
>> >>>>>>>> Code: e49de004 ea09ea24 e92d47f0 e3043ead (e5902004)
>> >>>>>>>> ---[ end trace cb88537fdc8fa206 ]---
>> >>>>>>>>
>> >>>>>>>> I use CONFIG_KVM_NEW_VGIC=y. This happens to me with a rather minimal
>> >>>>>>>> qemu invocation (qemu-system-arm -enable-kvm -M virt -cpu host
>> >>>>>>>> -nographic -serial stdio -kernel zImage).
>> >>>>>>>>
>> >>>>>>>> I am using a somewhat older QEMU version, 2.4.0.
>> >>>>>>>
>> >>>>>>> I just tried with a self compiled QEMU 2.4.0 and the Ubuntu 14.04
>> >>>>>>> provided 2.0.0, it worked fine with Linus' current HEAD as a host kernel
>> >>>>>>> on a Midway (Cortex-A15).
>> >>>>>>
>> >>>>>> I can reproduce the issue with a latest QEMU build on AMD Seattle
>> >>>>>> (I haven't tried anywhere else yet)
>> >>>>>>
>> >>>>>>>
>> >>>>>>> Can you try to disable the new VGIC, just to see if that's a regression?
>> >>>>>>
>> >>>>>> Disabling NEW_VGIC "fixes" guest boots.
>> >>>>>>
>> >>>>>> I'm not using defconfig for my host kernel. I'll do a couple more
>> >>>>>> tests and provide a comparison of my config vs. a defconfig in
>> >>>>>> a few minutes.
>> >>>>>
>> >>>>> Damn. It is not failing for me, so it has to be a kernel config thing...
>> >>>>> If you can narrow it down to the difference with defconfig, that'd be
>> >>>>> tremendously helpful.
>> >>>>
>> >>>> It's PAGE_SIZE; 64K doesn't work, 4K does, regardless of VA_BITS
>> >>>> selection.
>> >>>
>> >>> That definitely doesn't match Stefan's report (32bit only has 4k). I'll
>> >>
>> >> Hehe, was just plowing through code and came to that conclusion, glad I
>> >> got that right :-)
>> >>
>> >> What defconfig do you use? I could reproduce the issue also with
>> >> multi_v7_defconfig + ARM_LPAE + KVM.
>> >
>> > I have my own config file with the crap I need to make things work on
>> > the various platforms I have around. If multi_v7_defconfig works on the
>> > cubietruck, I'll give it a spin tomorrow. I need a beer now.
>> >
>> >> Btw, I am not exactly on vanilla 4.7-rc7; I merged Shawn's for-next +
>> >> clock next to get to the bits and pieces required for my board...
>> >>
>> >> That said, it works fine otherwise, and the stacktrace looks rather
>> >> platform independent...
>> >
>> > Yeah, and that's the worrying part.
>>
>>
>> FWIW, I tried here with Qemu 2.6.0, same stack trace...
> 
> I don't think this is userspace related, especially given that Andrew
> managed to trigger it on arm64 as well. I guess we're looking at
> something that changes the layout of memory (page size in Drew's case),
> and exposes another latent bug. I'll try to get multi_v7_defconfig
> running on my CT later today, and hopefully the thing will explode.
> Fingers crossed.
> 

I hit another issue, this time in the guest. At times, qemu-system-arm
appeared to freeze (no console output). I then enabled earlyprintk for
the PL01X UART and got this:

Architected timer frequency not available
------------[ cut here ]------------
WARNING: CPU: 0 PID: 0 at kernel/time/clockevents.c:44 cev_delta2ns+0x114/0x128
Modules linked in:
CPU: 0 PID: 0 Comm: swapper/0 Not tainted 4.7.0-rc7 #5
Hardware name: Generic DT based system
[<8010e2a0>] (unwind_backtrace) from [<8010b270>] (show_stack+0x10/0x14)
[<8010b270>] (show_stack) from [<8030e734>] (dump_stack+0x84/0x98)
[<8030e734>] (dump_stack) from [<8011a718>] (__warn+0xe8/0x100)
[<8011a718>] (__warn) from [<8011a7e0>] (warn_slowpath_null+0x20/0x28)
[<8011a7e0>] (warn_slowpath_null) from [<8017503c>] (cev_delta2ns+0x114/0x128)
[<8017503c>] (cev_delta2ns) from [<8017548c>] (clockevents_config.part.2+0x4c/0x6c)
[<8017548c>] (clockevents_config.part.2) from [<801754cc>] (clockevents_config_and_register+0x20/0x2c)
[<801754cc>] (clockevents_config_and_register) from [<80435d9c>] (arch_timer_setup+0xd8/0x1b4)
[<80435d9c>] (arch_timer_setup) from [<8081c118>] (arch_timer_of_init+0x2a0/0x2c8)
[<8081c118>] (arch_timer_of_init) from [<8081bb14>] (clocksource_probe+0x54/0x90)
[<8081bb14>] (clocksource_probe) from [<80800b30>] (start_kernel+0x240/0x378)
[<80800b30>] (start_kernel) from [<4000807c>] (0x4000807c)
---[ end trace cb88537fdc8fa200 ]---
Architected cp15 timer(s) running at 0.00MHz (virt).
Division by zero in kernel.
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.7.0-rc7 #5
Hardware name: Generic DT based system
[<8010e2a0>] (unwind_backtrace) from [<8010b270>] (show_stack+0x10/0x14)
[<8010b270>] (show_stack) from [<8030e734>] (dump_stack+0x84/0x98)
[<8030e734>] (dump_stack) from [<8030c764>] (Ldiv0_64+0x8/0x18)
[<8030c764>] (Ldiv0_64) from [<80172360>] (clocks_calc_max_nsecs+0x24/0x78)
[<80172360>] (clocks_calc_max_nsecs) from [<801725d8>] (__clocksource_update_freq_scale+0x224/0x2fc)
[<801725d8>] (__clocksource_update_freq_scale) from [<801726c4>] (__clocksource_register_scale+0x14/0xa8)
[<801726c4>] (__clocksource_register_scale) from [<8081be20>] (arch_timer_common_init+0x1d8/0x230)
[<8081be20>] (arch_timer_common_init) from [<8081c0dc>] (arch_timer_of_init+0x264/0x2c8)
[<8081c0dc>] (arch_timer_of_init) from [<8081bb14>] (clocksource_probe+0x54/0x90)
[<8081bb14>] (clocksource_probe) from [<80800b30>] (start_kernel+0x240/0x378)
[<80800b30>] (start_kernel) from [<4000807c>] (0x4000807c)
clocksource: arch_sys_counter: mask: 0xffffffffffffff max_cycles: 0x0, max_idle_ns: 0 ns
Division by zero in kernel.
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.7.0-rc7 #5
Hardware name: Generic DT based system
[<8010e2a0>] (unwind_backtrace) from [<8010b270>] (show_stack+0x10/0x14)
[<8010b270>] (show_stack) from [<8030e734>] (dump_stack+0x84/0x98)
[<8030e734>] (dump_stack) from [<8030c764>] (Ldiv0_64+0x8/0x18)
[<8030c764>] (Ldiv0_64) from [<80172288>] (clocks_calc_mult_shift+0x11c/0x13c)
[<80172288>] (clocks_calc_mult_shift) from [<8080a388>] (sched_clock_register+0x64/0x1d8)
[<8080a388>] (sched_clock_register) from [<8081be54>] (arch_timer_common_init+0x20c/0x230)
[<8081be54>] (arch_timer_common_init) from [<8081c0dc>] (arch_timer_of_init+0x264/0x2c8)
[<8081c0dc>] (arch_timer_of_init) from [<8081bb14>] (clocksource_probe+0x54/0x90)
[<8081bb14>] (clocksource_probe) from [<80800b30>] (start_kernel+0x240/0x378)
[<80800b30>] (start_kernel) from [<4000807c>] (0x4000807c)
Division by zero in kernel.
CPU: 0 PID: 0 Comm: swapper/0 Tainted: G        W       4.7.0-rc7 #5
Hardware name: Generic DT based system
[<8010e2a0>] (unwind_backtrace) from [<8010b270>] (show_stack+0x10/0x14)
[<8010b270>] (show_stack) from [<8030e734>] (dump_stack+0x84/0x98)
[<8030e734>] (dump_stack) from [<8030c764>] (Ldiv0_64+0x8/0x18)
[<8030c764>] (Ldiv0_64) from [<80172360>] (clocks_calc_max_nsecs+0x24/0x78)
[<80172360>] (clocks_calc_max_nsecs) from [<8080a3cc>] (sched_clock_register+0xa8/0x1d8)
[<8080a3cc>] (sched_clock_register) from [<8081be54>] (arch_timer_common_init+0x20c/0x230)
[<8081be54>] (arch_timer_common_init) from [<8081c0dc>] (arch_timer_of_init+0x264/0x2c8)
[<8081c0dc>] (arch_timer_of_init) from [<8081bb14>] (clocksource_probe+0x54/0x90)
[<8081bb14>] (clocksource_probe) from [<80800b30>] (start_kernel+0x240/0x378)
[<80800b30>] (start_kernel) from [<4000807c>] (0x4000807c)
sched_clock: 56 bits at 0 Hz, resolution 0ns, wraps every 0ns
Console: colour dummy device 80x30
Calibrating delay loop... 


When it works (roughly every fifth try), the frequency of the
architected timer is identified correctly:
Architected cp15 timer(s) running at 8.00MHz (virt).
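
To illustrate why a timer frequency of 0 Hz produces the "Division by
zero in kernel" messages above, here is a minimal userspace sketch of a
mult/shift search in the style of clocks_calc_mult_shift. It is an
approximation written for this note, not the kernel source; the function
names, the explicit zero check, and the maxsec value of 3600 are
assumptions made for illustration.

```python
NSEC_PER_SEC = 1_000_000_000


def calc_mult_shift(from_hz, to_hz, maxsec):
    """Find (mult, shift) so that (cycles * mult) >> shift converts a
    count at from_hz into units of to_hz (illustrative sketch only)."""
    if from_hz == 0:
        # This is the situation the failing guest boots report: the
        # division further down would be a division by zero.
        raise ValueError("source frequency is 0 Hz")

    # Narrow the usable shift range so that maxsec seconds worth of
    # cycles still fits into 64 bits after the multiplication.
    sftacc = 32
    tmp = (maxsec * from_hz) >> 32
    while tmp:
        tmp >>= 1
        sftacc -= 1

    # Walk down from the most precise shift until the multiplier fits.
    mult, shift = 0, 0
    for sft in range(32, 0, -1):
        mult = ((to_hz << sft) + from_hz // 2) // from_hz
        shift = sft
        if (mult >> sftacc) == 0:
            break
    return mult, shift


def cycles_to_ns(cycles, mult, shift):
    """Convert a cycle count to nanoseconds with a mult/shift pair."""
    return (cycles * mult) >> shift


if __name__ == "__main__":
    # 8 MHz is the frequency reported on the boots that work.
    mult, shift = calc_mult_shift(8_000_000, NSEC_PER_SEC, 3600)
    print(cycles_to_ns(8_000_000, mult, shift))
    # 0 Hz is what the failing boots report.
    try:
        calc_mult_shift(0, NSEC_PER_SEC, 3600)
    except ValueError as err:
        print("rejected:", err)
```

With a valid 8 MHz input, one second's worth of cycles converts back to
one second of nanoseconds; with 0 Hz there is no meaningful multiplier
at all, which matches the "max_cycles: 0x0" and "resolution 0ns" lines
in the log.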

Host looks good:
# dmesg | grep Architected
[    0.000000] Architected cp15 timer(s) running at 8.00MHz (phys).

AFAICT, U-Boot correctly initializes the timer's frequency in
arch/arm/imx-common/syscounter.c.

The guest is using a vanilla v4.7-rc7 kernel. 

The host is running without CONFIG_KVM_NEW_VGIC.

Looks like some kind of race during initialization...? Related to the
new VGIC issue?
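
Since the failure is intermittent, one way to put a number on it is to
capture the guest console of many boots and classify each by the
reported timer frequency. The sketch below assumes each boot's console
output is available as a string; the helper names are hypothetical, and
only the "Architected cp15 timer(s) running at ..." line format is taken
from the logs above.

```python
import re

# Matches e.g. "Architected cp15 timer(s) running at 8.00MHz (virt)."
TIMER_RE = re.compile(
    r"Architected cp15 timer\(s\) running at (\d+\.\d+)MHz \((\w+)\)"
)


def timer_mhz(log_text):
    """Return the reported timer frequency in MHz, or None if the
    line is not present in the log."""
    match = TIMER_RE.search(log_text)
    return float(match.group(1)) if match else None


def count_bad_boots(logs):
    """Return (boots reporting 0 MHz, total boots examined)."""
    bad = sum(1 for log in logs if timer_mhz(log) == 0.0)
    return bad, len(logs)


if __name__ == "__main__":
    logs = [
        "Architected cp15 timer(s) running at 0.00MHz (virt).",
        "Architected cp15 timer(s) running at 8.00MHz (virt).",
    ]
    bad, total = count_bad_boots(logs)
    print(f"{bad} of {total} boots reported a 0 MHz timer")
```

A log without the timer line is not counted as bad, so boots that hang
before reaching that point would need to be tracked separately.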

--
Stefan
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm


