Re: KVM/ARM status and branches

On Mon, Sep 10, 2012 at 4:04 PM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
> On Mon, 10 Sep 2012 10:32:04 -0400, Christoffer Dall
> <c.dall@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>> On Mon, Sep 10, 2012 at 6:18 AM, Marc Zyngier <marc.zyngier@xxxxxxx> wrote:
>>> On 10/09/12 05:04, Christoffer Dall wrote:
>>>> Hello,
>>>>
>>>> We have a new branch, which will never be rebased and should always be
>>>> bisectable and mergeable. It's kvm-arm-master and can be found here:
>>>>
>>>> git://github.com/virtualopensystems/linux-kvm-arm.git kvm-arm-master
>>>>
>>>> (or pointy-clicky web interface:)
>>>> https://github.com/virtualopensystems/linux-kvm-arm
>>>>
>>>> This branch merges 3.6-rc5
>>>>
>>>> The branch also merges all Marc Zyngier's timer, vgic and hyp-mode
>>>> boot branches.
>>>>
>>>> It is also merged with the IRQ injection API changes (touched
>>>> KVM_IRQ_LINE), as there haven't been any other comments on this. This
>>>> requires qemu patches, which can be found here:
>>>>
>>>> git://github.com/virtualopensystems/qemu.git kvm-arm-irq-api
>>>>
>>>> (or pointy-clicky web interface:)
>>>> https://github.com/virtualopensystems/qemu
>>>>
>>>> Two things are outstanding on my end before I attempt an initial
>>>> upstream submission:
>>>>  1. We have a bug when we start swapping in the host, the guest kernel
>>>> dies with "BUG: Bad page state..." and all sorts of bad things follow.
>>>> If we really stress the host under memory pressure, it seems the host can
>>>> also crash, or at least become completely unresponsive. The same test
>>>> on a KVM kernel without any VMs does not cause this BUG.
>>>
>>> Is that the one you're seeing?
>>>
>>> [  312.189234] ------------[ cut here ]------------
>>> [  312.203056] kernel BUG at arch/arm/kvm/mmu.c:382!
>>> [  312.217134] Internal error: Oops - BUG: 0 [#1] PREEMPT SMP THUMB2
>>> [  312.235376] Modules linked in:
>>> [  312.244515] CPU: 0    Not tainted  (3.6.0-rc3+ #40)
>>> [  312.259118] PC is at stage2_clear_pte+0x128/0x134
>>> [  312.273193] LR is at kvm_unmap_hva+0x97/0xa0
>>> [  312.285967] pc : [<c001e10c>]    lr : [<c001ee0f>]    psr: 60000133
>>> [  312.285967] sp : caa25998  ip : df97a028  fp : 00800000
>>> [  312.320355] r10: 873b5b5f  r9 : c8654000  r8 : 01c55000
>>> [  312.355532] r7 : 00000000  r6 : df249c00  r5 : c688fb80  r4 : df249ccc
>>> [  312.375076] r3 : 00000000  r2 : 2e001000  r1 : 00000000  r0 : 00000000
>>> [  312.375076] Flags: nZCv  IRQs on  FIQs on  Mode SVC_32  ISA Thumb  Segment user
>>> [  312.396962] Control: 70c5387d  Table: 8a9bbb00  DAC: fffffffd
>>> [  312.414161] Process hackbench (pid: 7207, stack limit = 0xcaa242f8)
>>>
>>
>> FYI, this is what I'm seeing in the guest in more detail (this
>> couldn't be the icache stuff, could it?):
>
> [...]
>
> I do see similar things - and some others. It is really random.
>
> I tried nuking the icache without any success. I spent the whole day
> adding flushes on every code path, without making a real difference. And
> the more I think of it, the more I'm convinced that this is caused by the
> way we manipulate pages without telling the kernel what we're actually
> doing.
>
> What happens is that as far as the kernel is concerned, the qemu pages are
> always clean. We never flag a page dirty, because it is the guest that
> performs the write, and we're completely oblivious of that path. What I
> think happens is that the guest writes some data to the cache (or even to
> memory) and the underlying page gets evicted without being sync-ed first,
> because nobody knows it's been modified.
>
> If my gut feeling is right, we need to tell the kernel that as soon as a
> page is inserted in stage-2, it is assumed to be dirty. We could always
> mark them read-only and resolve the fault at a later time, but that isn't
> important at the moment. And we need to flag it in the qemu mapping,
> because it is the one being evicted.
>
> What do you think?
>
I think this is definitely a good bet. I remember Alex Graf saying
something about KVM taking care of the dirty bit for us, but I'm not
sure.

We already mark pages read-only if that makes sense, so we could avoid
setting the dirty bit there.

I will try this out right away.

-Christoffer
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/cucslists/listinfo/kvmarm

