Re: kvm oops vgic_v2_sync_lr_elrsr

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hej Christoffer,

On 09/22/2014 10:04 PM, Christoffer Dall wrote:
> Hi Andre,
> 
> On Mon, Sep 22, 2014 at 06:00:46PM +0100, Andre Przywara wrote:
>> Hi Riku,
>>
>> thanks for reporting that.
>> I found the root cause of it, it is a kernel bug introduced with -rc1,
>> during refactoring of the GIC code in preparation for GICv3 support.
>>
>> On 22/09/14 13:00, Riku Voipio wrote:
>>> Since last week, we have been getting and Ooops when launching kvm on
>>> mustang. This is with kvm-arm git with testing/mustang branch, but I
>>> also reproduced it with 3.17-rc6 (as attached on the log).
>>>
>>> PC is at set_bit+0x14/0x30
>>
>> set_bit is using ldxr, which requires the memory to be naturally
>> aligned. The bitmap is not, since it is composed of two u32's to make
>> the code easier to compile on 32-bit.
>>
>>> LR is at vgic_v2_sync_lr_elrsr+0x20/0x28
>>
>> This code snippet is only called when using level interrupts. kvmtool
>> and older QEMU version only use edge triggered interrupts and thus never
>> call this code.
>> I don't have a newest QEMU version running yet (due to having fun
>> cross-compiling the bloody glib), so cannot test here, but obviously it
>> uses level interrupts somewhere.
>> EDIT: it does:
>> http://git.qemu.org/?p=qemu.git;a=commitdiff;h=0be969a2d974971628fc4ed95834d22ecf0fd497
>>
>> Riku, can you dump the guest's device tree (-dumpdtb on QEMU cmdline) to
>> confirm that a device uses a level interrupt? They should have a 4 (or
>> 8) in the last of the three numbers in the interrupts property.
>>
>> I am just about to think about the best fix (__set_bit, reordering
>> members of the struct).
>>
>> Christoffer, any suggestions? Prepare for a sprint to get a fix into
>> before the release. Marc is out of office (Murphy's law).
>>
> Thanks for chasing down the cause of this one.  See my response to Joel
> with a one-line fix.

Thanks for the patch and the quick response! I thought about the
__set_bit fix too (with the very same reasoning as you ;-), but I felt
like it being a bit of a kludge solution. But that's probably OK given
the rush for the release.

> We should probably consider changing all the
> bit-ops in the vgic code to use non-atomic variants given that the whole
> manipulation is done while holding a spinlock anyway, but it would be
> good to measure the effects of this, stress-test the system a bit, and I
> don't want to introduce such an invasive change this late in a kernel
> release, so let's just focus on fixing this one issue first.

Fair enough, and I agree we should be using the non-atomic versions.
But I'd also like to see this casting thing fixed. Casting this clearly
unaligned u32 array to a long pointer gives me a bad feeling.
I will check tomorrow if there are any implications on the ordering of
the members (I remember Marc mentioning something in this area), if not,
we should move the u32 arrays upfront and maybe provide an union with an
u64 if that helps the compiler.

Will sent my Acked-by: once tested tomorrow.

Thanks and Regards,
Andre.

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm




[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux