Re: KVM/ARM status and branches

On Mon, 10 Sep 2012 10:32:04 -0400, Christoffer Dall
<c.dall@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> On Mon, Sep 10, 2012 at 6:18 AM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>> On 10/09/12 05:04, Christoffer Dall wrote:
>>> Hello,
>>>
>>> We have a new branch, which will never be rebased and should always be
>>> bisectable and mergeable. It's kvm-arm-master and can be found here:
>>>
>>> git://github.com/virtualopensystems/linux-kvm-arm.git kvm-arm-master
>>>
>>> (or pointy-clicky web interface:)
>>> https://github.com/virtualopensystems/linux-kvm-arm
>>>
>>> This branch merges 3.6-rc5
>>>
>>> The branch also merges all of Marc Zyngier's timer, vgic and hyp-mode
>>> boot branches.
>>>
>>> It is also merged with the IRQ injection API changes (touched
>>> KVM_IRQ_LINE) as there haven't been any other comments on this. This
>>> requires qemu patches, which can be found here:
>>>
>>> git://github.com/virtualopensystems/qemu.git kvm-arm-irq-api
>>>
>>> (or pointy-clicky web interface:)
>>> https://github.com/virtualopensystems/qemu
>>>
>>> Two things are outstanding on my end before I attempt an initial
>>> upstream submission:
>>>  1. We have a bug when we start swapping in the host: the guest kernel
>>> dies with "BUG: Bad page state..." and all sorts of bad things follow.
>>> If we really stress the host with memory pressure, it seems that the
>>> host can also crash, or at least become completely unresponsive. The
>>> same test on a KVM kernel without any VMs does not cause this BUG.
>>
>> Is that the one you're seeing?
>>
>> [  312.189234] ------------[ cut here ]------------
>> [  312.203056] kernel BUG at arch/arm/kvm/mmu.c:382!
>> [  312.217134] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP THUMB2
>> [  312.235376] Modules linked in:
>> [  312.244515] CPU: 0    Not tainted  (3.6.0-rc3+ #40)
>> [  312.259118] PC is at stage2_clear_pte+0x128/0x134
>> [  312.273193] LR is at kvm_unmap_hva+0x97/0xa0
>> [  312.285967] pc : [<c001e10c>]    lr : [<c001ee0f>]    psr: 60000133
>> [  312.285967] sp : caa25998  ip : df97a028  fp : 00800000
>> [  312.320355] r10: 873b5b5f  r9 : c8654000  r8 : 01c55000
>> [  312.335990] r7 : 00000000  r6 : df249c00  r5 : c688fb80  r4 : df249ccc
>> [  312.355532] r3 : 00000000  r2 : 2e001000  r1 : 00000000  r0 : 00000000
>> [  312.375076] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment user
>> [  312.396962] Control: 70c5387d  Table: 8a9bbb00  DAC: fffffffd
>> [  312.414161] Process hackbench (pid: 7207, stack limit = 0xcaa242f8)
>>
> 
> FYI, this is what I'm seeing in the guest in more detail (this
> couldn't be the icache stuff, could it?):

[...]

I do see similar things - and some others. It is really random.

I tried nuking the icache without any success. I spent the whole day
adding flushes on every code path, without making a real difference. And
the more I think about it, the more I'm convinced that this is caused by
the way we manipulate pages without telling the kernel what we're
actually doing.

As far as the kernel is concerned, the qemu pages are always clean. We
never flag a page dirty, because it is the guest that performs the write,
and we're completely oblivious to that path. What I think happens is that
the guest writes some data to the cache (or even to memory) and the
underlying page gets evicted without being synced first, because nobody
knows it's been modified.

If my gut feeling is right, we need to tell the kernel that as soon as a
page is inserted in stage-2, it is assumed to be dirty. We could always
map them read-only and resolve the write fault at a later time, marking
the page dirty at that point, but that isn't important at the moment. And
we need to flag it in the qemu mapping, because it is the one being
evicted.
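
To make the suspected failure mode concrete, here is a toy userspace
model. It is only a sketch of the reasoning above, not kernel code: every
name in it (toy_page, toy_stage2_map, toy_evict, ...) is made up for
illustration and does not correspond to any real KVM or mm symbol. It
assumes the host writes a page back on eviction only when its dirty flag
is set, and that guest writes never touch that flag.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_PAGE_SIZE 64

/* Toy model only: none of these names are real kernel or KVM symbols. */
struct toy_page {
    char ram[TOY_PAGE_SIZE];   /* page contents while resident */
    char swap[TOY_PAGE_SIZE];  /* what the host last wrote back */
    bool resident;
    bool dirty;                /* the host's view of the page */
};

/* Host maps the page into the guest's stage-2 tables. */
static void toy_stage2_map(struct toy_page *p, bool mark_dirty_on_map)
{
    if (!p->resident) {
        memcpy(p->ram, p->swap, TOY_PAGE_SIZE);  /* swap the page in */
        p->resident = true;
    }
    if (mark_dirty_on_map)
        p->dirty = true;       /* proposed rule: mapped means dirty */
}

/* Guest writes go straight through stage-2; the host never sees them. */
static void toy_guest_write(struct toy_page *p, const char *data)
{
    size_t len = strlen(data) + 1;

    assert(p->resident && len <= TOY_PAGE_SIZE);
    memcpy(p->ram, data, len);
    /* p->dirty is deliberately left untouched here */
}

/* Host reclaims the page; only dirty pages are written back. */
static void toy_evict(struct toy_page *p)
{
    if (p->dirty) {
        memcpy(p->swap, p->ram, TOY_PAGE_SIZE);
        p->dirty = false;
    }
    memset(p->ram, 0, TOY_PAGE_SIZE);
    p->resident = false;
}

/* Returns true if the guest's data survives an evict/refault cycle. */
bool toy_guest_data_survives(bool mark_dirty_on_map)
{
    struct toy_page p;

    memset(&p, 0, sizeof(p));
    toy_stage2_map(&p, mark_dirty_on_map);
    toy_guest_write(&p, "guest data");
    toy_evict(&p);
    toy_stage2_map(&p, mark_dirty_on_map);  /* guest faults it back in */
    return strcmp(p.ram, "guest data") == 0;
}
```

In this model, toy_guest_data_survives(false) corresponds to the current
behaviour and loses the guest's write, while toy_guest_data_survives(true)
corresponds to the "dirty as soon as it is in stage-2" rule and keeps it.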

What do you think?

        M.
-- 
Fast, cheap, reliable. Pick two.