Re: [PATCH v8 09/15] ARM: KVM: Memory virtualization setup

Christoffer Dall <c.dall@xxxxxxxxxxxxxxxxxxxxxx> · Thu, 28 Jun 2012 18:51:39 -0400

On Thu, Jun 28, 2012 at 6:34 PM, Marcelo Tosatti <mtosatti@xxxxxxxxxx> wrote:
> On Fri, Jun 15, 2012 at 03:08:22PM -0400, Christoffer Dall wrote:
>> From: Christoffer Dall <cdall@xxxxxxxxxxxxxxx>
>>
>> This commit introduces the framework for guest memory management
>> through the use of 2nd stage translation. Each VM has a pointer
>> to a level-1 table (the pgd field in struct kvm_arch) which is
>> used for the 2nd stage translations. Entries are added when handling
>> guest faults (later patch) and the table itself can be allocated and
>> freed through the following functions implemented in
>> arch/arm/kvm/arm_mmu.c:
>>  - kvm_alloc_stage2_pgd(struct kvm *kvm);
>>  - kvm_free_stage2_pgd(struct kvm *kvm);
>>
>> Further, each entry in TLBs and caches are tagged with a VMID
>> identifier in addition to ASIDs. The VMIDs are assigned consecutively
>> to VMs in the order that VMs are executed, and caches and tlbs are
>> invalidated when the VMID space has been used to allow for more than
>> 255 simultaneously running guests.
>>
>> The 2nd stage pgd is allocated in kvm_arch_init_vm(). The table is
>> freed in kvm_arch_destroy_vm(). Both functions are called from the main
>> KVM code.
>>
>> Signed-off-by: Christoffer Dall <c.dall@xxxxxxxxxxxxxxxxxxxxxx>
>
> Can you explain on a high level how the IPA -> PA mappings work?
>

The memory system on ARM with the Virtualization Extensions is
separated into two stages of translation: stage 1 and stage 2.

If stage 2 translation is disabled, which it is when we boot the host
kernel, the MMU performs only the stage 1 translation, using a
three-level page table, and the result of that translation is used
to access physical memory.

If stage 2 translation is enabled, the output of the stage 1
translation (which is a 40-bit intermediate physical address, IPA,
a.k.a. gpa_t/gfn in KVM language) is used for a stage 2 translation
that takes the 40-bit address as input and uses another set of page
tables, also in 3 levels, to produce the final physical address (PA).
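
To make the 3-level walk a bit more concrete, here is a minimal sketch
of how a 40-bit IPA could be split into per-level table indices. It
assumes 4 KiB pages and 8-byte descriptors (so 9 index bits per level,
with the remaining 10 bits going to level 1); the struct and function
names are made up for illustration and are not taken from the patch.

```c
#include <stdint.h>

/* Assumed layout: 4 KiB pages, 8-byte descriptors, 40-bit IPA. */
#define TOY_IPA_BITS  40
#define TOY_L1_SHIFT  30   /* one level-1 entry covers 1 GiB */
#define TOY_L2_SHIFT  21   /* one level-2 entry covers 2 MiB */
#define TOY_L3_SHIFT  12   /* one level-3 entry covers 4 KiB */

/* Hypothetical helper type, not part of the kernel. */
struct toy_s2_index {
	unsigned int l1;      /* IPA[39:30] */
	unsigned int l2;      /* IPA[29:21] */
	unsigned int l3;      /* IPA[20:12] */
	unsigned int offset;  /* IPA[11:0], byte offset within the page */
};

/* Split an IPA into the indices used at each level of the walk. */
static struct toy_s2_index toy_s2_split_ipa(uint64_t ipa)
{
	struct toy_s2_index idx;

	/* Ignore anything above the 40-bit IPA range. */
	ipa &= (UINT64_C(1) << TOY_IPA_BITS) - 1;

	idx.l1 = (unsigned int)(ipa >> TOY_L1_SHIFT);
	idx.l2 = (unsigned int)((ipa >> TOY_L2_SHIFT) & 0x1ff);
	idx.l3 = (unsigned int)((ipa >> TOY_L3_SHIFT) & 0x1ff);
	idx.offset = (unsigned int)(ipa & 0xfff);
	return idx;
}
```

With these assumptions, the IPA 0x80201234 would select level-1
entry 2, level-2 entry 1, level-3 entry 1 and page offset 0x234.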

If a fault happens during stage 1 translation, the fault is taken:
 a) directly by the host, when the host is running and stage 2
translation is disabled
 b) directly by the guest VM, when the guest is running and stage 2
translation is enabled

If a fault happens during stage 2 translation, the fault is always
taken by the hypervisor, which populates the missing entry in the
stage 2 page table (or changes the entry to be a writable entry in the
case of a permission fault).
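
As a rough illustration of the two stage 2 fault cases, the sketch
below models the stage 2 table as a flat array indexed by gfn. This is
a toy model, not the code in the patch series: all names are
hypothetical, and mapping a page read-only on the first fault is an
assumption made only so that the permission-fault path has something
to do.

```c
#include <stdbool.h>
#include <stdint.h>

#define TOY_NR_PAGES 16  /* size of the toy guest address space, in pages */

enum toy_s2_fault {
	TOY_S2_FAULT_TRANSLATION,  /* no valid stage 2 entry for the gfn */
	TOY_S2_FAULT_PERMISSION    /* entry exists, but the access was a write */
};

struct toy_s2_entry {
	bool valid;
	bool writable;
	uint64_t pfn;  /* host page frame backing this gfn */
};

struct toy_vm {
	struct toy_s2_entry s2[TOY_NR_PAGES];
};

/*
 * Handle a stage 2 fault on behalf of the guest.
 * Returns 0 on success, -1 if the fault cannot be resolved.
 */
static int toy_handle_s2_fault(struct toy_vm *vm, uint64_t gfn,
			       enum toy_s2_fault type, uint64_t host_pfn)
{
	struct toy_s2_entry *e;

	if (gfn >= TOY_NR_PAGES)
		return -1;
	e = &vm->s2[gfn];

	switch (type) {
	case TOY_S2_FAULT_TRANSLATION:
		/* Populate the missing entry (read-only: an assumption). */
		e->pfn = host_pfn;
		e->writable = false;
		e->valid = true;
		return 0;
	case TOY_S2_FAULT_PERMISSION:
		/* Only an existing entry can be upgraded to writable. */
		if (!e->valid)
			return -1;
		e->writable = true;
		return 0;
	}
	return -1;
}
```

In this model the guest never sees either fault; it simply retries the
access once the hypervisor has fixed up the entry.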

When a VM boots, the MMU is disabled for the guest's stage 1
translations, so the address produced by the CPU is fed directly to
the stage 2 translation system.
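
Put together, the path from a guest address to a physical address can
be sketched as below. The two walk functions are stand-ins that just
add fixed offsets (purely an assumption for the example); the point is
only that stage 1 is bypassed while the guest MMU is off, whereas
stage 2 is always applied.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical callback type standing in for a page table walk. */
typedef uint64_t (*toy_walk_fn)(uint64_t addr);

/* Translate a guest address to a physical address in the toy model. */
static uint64_t toy_guest_translate(bool s1_mmu_on, uint64_t va,
				    toy_walk_fn stage1, toy_walk_fn stage2)
{
	/* With the guest MMU off, the CPU address is used as the IPA. */
	uint64_t ipa = s1_mmu_on ? stage1(va) : va;

	/* Stage 2 is applied in both cases. */
	return stage2(ipa);
}

/* Stand-in walks: fixed offsets instead of real page tables. */
static uint64_t toy_stage1_walk(uint64_t va)
{
	return va + 0x1000;
}

static uint64_t toy_stage2_walk(uint64_t ipa)
{
	return ipa + UINT64_C(0x80000000);
}
```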

A nice diagram is shown on page B3-1330 of the ARM ARM I referred you
to in the other mail.

Let me know if this is the level you had in mind.

-Christoffer