MMIO regions and emulation code re-factor

cd2436 at columbia.edu (Christoffer Dall) · Tue, 13 Oct 2009 23:20:37 -0400

QEMU and MMIO regions:
------------------------------------
I think I have solved the first hurdle regarding QEMU<->KVM io mappings.

When QEMU registers a memory region with its virtual CPU that is
never actually allocated as user memory, the KVM interaction code sees
the region as IO_MEM_UNASSIGNED and never registers it with KVM.
However, KVM already has an API for registering this type of memory
region: the ioctl KVM_SET_MEMORY_REGION (as opposed to
KVM_SET_USER_MEMORY_REGION). This ioctl must be implemented in the
kernel per host architecture and handled accordingly. I have done this
by registering the memory regions with KVM as usual, but setting the
existing user_alloc flag to 0 on the kvm_memory_slot structure.
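
To make the registration step concrete, here is a minimal user-space
model of it. Everything in it is an assumption for illustration:
slot_sketch is a stand-in for the few kvm_memory_slot fields that
matter here, and slot_register is a hypothetical helper, not the
kernel's ioctl handler. A host address of 0 models a region that QEMU
never allocated as user memory.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_SLOTS_SKETCH  8
#define PAGE_SHIFT_SKETCH 12

/* Hypothetical stand-in for the kvm_memory_slot fields used here. */
struct slot_sketch {
    uint64_t  base_gfn;       /* first guest frame number of the region */
    uint64_t  npages;         /* region size in pages, rounded up */
    uintptr_t userspace_addr; /* host virtual base, unused for I/O regions */
    bool      user_alloc;     /* true: backed by user memory, false: I/O */
    bool      in_use;
};

static struct slot_sketch slots[MAX_SLOTS_SKETCH];

/*
 * Register a guest physical region in slot `id`.
 * host_addr == 0 means "no backing user memory", so user_alloc is cleared.
 * Returns 0 on success, -1 on a bad slot index or an empty region.
 */
static int slot_register(unsigned int id, uint64_t gpa, uint64_t size,
                         uintptr_t host_addr)
{
    uint64_t page_mask = ((uint64_t)1 << PAGE_SHIFT_SKETCH) - 1;

    if (id >= MAX_SLOTS_SKETCH || size == 0)
        return -1;

    slots[id].base_gfn       = gpa >> PAGE_SHIFT_SKETCH;
    slots[id].npages         = (size + page_mask) >> PAGE_SHIFT_SKETCH;
    slots[id].userspace_addr = host_addr;
    slots[id].user_alloc     = (host_addr != 0);
    slots[id].in_use         = true;
    return 0;
}
```

With this model, RAM and a device window are registered through the
same path and differ only in the user_alloc flag.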

Thus, on a shadow page fault, instead of immediately translating the
guest physical address to a host virtual address, I first find the
corresponding memory slot and check whether it has user_alloc set. If
it does, I translate to the host virtual address and add the mapping
to the shadow page tables; otherwise, I treat the access as an MMIO
operation.
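
The decision on the fault path can be sketched roughly as follows.
This is a self-contained model, not the actual fault handler: the
struct, the enum and both function names are hypothetical, and the
"no slot found" outcome is my own addition for completeness.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified memory slot (byte-granular for brevity). */
struct slot_sketch {
    uint64_t  base_gpa;       /* guest physical base address */
    uint64_t  size;           /* region size in bytes */
    uintptr_t userspace_addr; /* host virtual base, unused for I/O regions */
    bool      user_alloc;     /* true: user memory, false: I/O region */
};

enum fault_action {
    FAULT_MAP_PAGE, /* translate and install a shadow mapping */
    FAULT_MMIO,     /* hand the access over as an MMIO operation */
    FAULT_NO_SLOT   /* address is not covered by any slot */
};

/* Linear search for the slot containing gpa, or NULL if there is none. */
static const struct slot_sketch *find_slot(const struct slot_sketch *s,
                                           size_t n, uint64_t gpa)
{
    for (size_t i = 0; i < n; i++)
        if (gpa >= s[i].base_gpa && gpa - s[i].base_gpa < s[i].size)
            return &s[i];
    return NULL;
}

/* Classify a fault at gpa; *hva is written only for FAULT_MAP_PAGE. */
static enum fault_action handle_shadow_fault(const struct slot_sketch *s,
                                             size_t n, uint64_t gpa,
                                             uintptr_t *hva)
{
    const struct slot_sketch *slot = find_slot(s, n, gpa);

    if (!slot)
        return FAULT_NO_SLOT;
    if (!slot->user_alloc)
        return FAULT_MMIO;

    *hva = slot->userspace_addr + (uintptr_t)(gpa - slot->base_gpa);
    return FAULT_MAP_PAGE;
}
```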

The MMIO operation itself seems fairly straightforward: set
vcpu->run->exit_reason = KVM_EXIT_MMIO and fill in the
vcpu->run->mmio structure. However, the fields in the mmio struct
are:
 - phys_addr
 - data
 - len
 - is_write
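
As a rough illustration of what filling these fields involves, here is
a stand-alone sketch. The two structs are local stand-ins that mirror
only the four fields listed above plus exit_reason; EXIT_MMIO_SKETCH
and prepare_mmio_exit are hypothetical names, not existing KVM
symbols.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define EXIT_MMIO_SKETCH 6u /* placeholder for KVM_EXIT_MMIO */

/* Local mirror of the mmio fields named above. */
struct mmio_sketch {
    uint64_t phys_addr;
    uint8_t  data[8];
    uint32_t len;
    uint8_t  is_write;
};

struct run_sketch {
    uint32_t           exit_reason;
    struct mmio_sketch mmio;
};

/*
 * Describe one MMIO access in *run.
 * For a write, store_val points to the len bytes being stored.
 * For a read, store_val is ignored and data is left zeroed so that
 * user space can fill it in.
 * Returns 0 on success, -1 on an invalid length or missing store value.
 */
static int prepare_mmio_exit(struct run_sketch *run, uint64_t gpa,
                             uint32_t len, int is_write,
                             const void *store_val)
{
    if (len == 0 || len > sizeof(run->mmio.data))
        return -1;
    if (is_write && store_val == NULL)
        return -1;

    memset(&run->mmio, 0, sizeof(run->mmio));
    run->exit_reason   = EXIT_MMIO_SKETCH;
    run->mmio.phys_addr = gpa;
    run->mmio.len       = len;
    run->mmio.is_write  = is_write ? 1 : 0;
    if (is_write)
        memcpy(run->mmio.data, store_val, len);
    return 0;
}
```

Note that gpa comes from the fault itself, while len, is_write and the
stored value have to be supplied by the caller.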

To fill in this data (particularly data and len) I see no way around
analyzing the faulting instruction and emulating its behavior, which
leads me to the next point.

Emulation code re-factoring:
--------------------------------------
There is quite a lot of working code in the
arch/arm/kvm/arm_emulate.c file, which is currently used mainly for
emulating co-processor and status-register instructions, but there is
some support for load-store and data-processing instructions in there
as well. Unfortunately, the code doesn't follow any kernel coding
standards and is generally hard to read and follow. What's worse, the
code is not sufficiently modularized to be used to analyze
instructions in the MMIO case. For these reasons I am trying to
re-factor the code somewhat.

I can think of these scenarios in which it will be relevant to emulate
instructions:
 - When a sensitive instruction is executed and the guest is in priv. mode
 - Load-store instructions accessing MMIO regions
 - Load-store instructions accessing reserved regions such as the
interrupt vector page.

The sensitive instructions can be divided into these categories:
 - Data-processing instructions with S bit set and PC as destination
(changes priv. mode)
 - Program status register instructions
 - Co-processor instructions
 - Load-store instructions that modify user-mode registers (when guest
is in priv. mode)
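
For discussion, the four categories above could be recognized with a
decoder along these lines. This is only a sketch under my own
assumptions: the enum and function names are hypothetical, the bit
patterns are the classic 32-bit ARM encodings, the condition field is
ignored (so the unconditional 0b1111 space is not treated specially),
and "load-store instructions that modify user-mode registers" is
taken to mean LDM/STM with the S (^) bit set.

```c
#include <stdbool.h>
#include <stdint.h>

enum sens_class {
    SENS_NONE,
    SENS_DP_S_PC,      /* data-processing, S bit set, Rd == PC */
    SENS_PSR,          /* MRS / MSR */
    SENS_COPROC,       /* CDP, MCR, MRC, LDC, STC */
    SENS_LS_USER_REGS  /* LDM / STM with the S (^) bit */
};

static enum sens_class classify_sensitive(uint32_t instr)
{
    /* MRS: cond 0001 0R00 1111 Rd 0000 0000 0000 */
    if ((instr & 0x0FBF0FFFu) == 0x010F0000u)
        return SENS_PSR;
    /* MSR (register): cond 0001 0R10 mask 1111 0000 0000 Rm */
    if ((instr & 0x0FB0FFF0u) == 0x0120F000u)
        return SENS_PSR;
    /* MSR (immediate): cond 0011 0R10 mask 1111 rot imm8 */
    if ((instr & 0x0FB0F000u) == 0x0320F000u)
        return SENS_PSR;

    /* CDP/MCR/MRC: bits 27:24 == 1110; LDC/STC: bits 27:25 == 110 */
    if ((instr & 0x0F000000u) == 0x0E000000u ||
        (instr & 0x0E000000u) == 0x0C000000u)
        return SENS_COPROC;

    /* LDM/STM: bits 27:25 == 100, with S (bit 22) set */
    if ((instr & 0x0E400000u) == 0x08400000u)
        return SENS_LS_USER_REGS;

    /* Data-processing space: bits 27:26 == 00 ... */
    if ((instr & 0x0C000000u) == 0) {
        /* ... minus multiplies and extra load/stores
         * (bit 25 == 0, bit 7 == 1, bit 4 == 1) */
        bool mul_or_extra_ls = (instr & 0x02000090u) == 0x00000090u;
        bool s_bit = (instr & (1u << 20)) != 0;
        bool rd_is_pc = ((instr >> 12) & 0xFu) == 15u;

        if (!mul_or_extra_ls && s_bit && rd_is_pc)
            return SENS_DP_S_PC;
    }
    return SENS_NONE;
}
```

For example, MOVS pc, lr (0xE1B0F00E) lands in the first category,
while a plain MOV r0, r1 (0xE1A00001) is not flagged at all.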

If anybody can think of other cases where we need to emulate
instructions, now is the time to speak up :)

The design I'm working on separates the emulation code into small
functions as much as possible and makes them as state-independent as
possible. For instance, instead of simply passing the vcpu struct
pointer to a function called emulate_instr, which then modifies the
state of the vcpu (as is done now), there would be several smaller
functions in the style of the suggestion below.

Any suggestions on improving this are most welcome:

Emulation example code:
----------------------------------

u32 decode_ls_address(u32 instr);
u32 get_store_value(u32 instr);
u8 get_load_dest_register(u32 instr);
u32 get_ls_instr_datalen(u32 instr);

int emulate_ls_instr(u32 instr)
{
    u32 addr = decode_ls_address(instr);
    hva_t hva = gva_to_hva(addr);

    switch (get_ls_instr_index(instr)) {
    case ARM_LS_INSTR_LDR: {
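        /* assumes vcpu->regs is an array of register-sized slots */
        char *dest = (char *)(vcpu->regs + get_load_dest_register(instr));
        u32 len = get_ls_instr_datalen(instr);

        /* hva_t is an integer type, so cast before using it as a pointer */
        memcpy(dest, (void *)hva, len);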
        break;
    }
    case ......
    ....
    ...
    } /* end of switch */
    return 0;
}
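
To give an idea of what the small helpers might look like, here is a
stand-alone sketch for the simplest case only: word/byte LDR/STR with
an immediate offset. The bodies are my assumptions, not existing code.
get_load_dest_register and get_ls_instr_datalen reuse the names from
the prototypes above; is_ls_imm, ls_is_load and ls_imm_address are
hypothetical, and ls_imm_address takes the register file as an
explicit argument instead of reaching into the vcpu. Reading the PC as
base register (instruction address + 8) and base write-back are not
handled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit fields of an ARM single data transfer (LDR/STR/LDRB/STRB). */
#define LS_BIT_P (1u << 24) /* offset applied before the access */
#define LS_BIT_U (1u << 23) /* add the offset when set, else subtract */
#define LS_BIT_B (1u << 22) /* byte access when set, else word */
#define LS_BIT_L (1u << 20) /* load when set, else store */

/* True for the immediate-offset form: bits 27:25 == 010. */
static bool is_ls_imm(uint32_t instr)
{
    return (instr & 0x0E000000u) == 0x04000000u;
}

static bool ls_is_load(uint32_t instr)
{
    return (instr & LS_BIT_L) != 0;
}

/* Rd field: destination of a load, source of a store. */
static uint8_t get_load_dest_register(uint32_t instr)
{
    return (uint8_t)((instr >> 12) & 0xFu);
}

/* Access width in bytes. */
static uint32_t get_ls_instr_datalen(uint32_t instr)
{
    return (instr & LS_BIT_B) ? 1u : 4u;
}

/* Address used for the access, given the 16 guest registers. */
static uint32_t ls_imm_address(uint32_t instr, const uint32_t regs[16])
{
    uint32_t base = regs[(instr >> 16) & 0xFu];
    uint32_t offset = instr & 0xFFFu;

    if (!(instr & LS_BIT_P))
        return base; /* post-indexed: the access uses the plain base */
    return (instr & LS_BIT_U) ? base + offset : base - offset;
}
```

For instance, with r2 = 0x1000, LDR r1, [r2, #4] (0xE5921004) decodes
to a 4-byte load from 0x1004 into r1, which is the information needed
for the len and is_write fields of the mmio struct.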

Thanks,
Christoffer