On 02/08/2013 02:11 PM, Marcelo Tosatti wrote:
> On Tue, Feb 05, 2013 at 11:02:32AM -0800, David Daney wrote:
>> Hi,
>>
>> I am starting to work on a port of KVM to an architecture that
>> has a dual TLB. The Guest Virtual Addresses (GVA) are translated to
>> Guest Physical Addresses (GPA) by the first TLB, then a second TLB
>> translates the GPA to a Root Physical Address (RPA). For the sake
>> of this question, we will ignore the GVA->GPA TLB and consider only
>> the GPA->RPA TLB.
>>
>> It seems that most existing ports have a bunch of custom code that
>> manages the GPA->RPA TLB and page tables.
>>
>> Here is what I would like to try: Create an mm for the GPA->RPA
>> mappings; each vma would have a fault handler that calls gfn_to_pfn()
>> to look up the proper page. In kvm_arch_vcpu_ioctl_run() we would
>> call switch_mm() to this new 'gva_mm'.
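
A minimal sketch of that entry/exit switch (the 'gva_mm' field and the
guest-entry helper below are hypothetical names for the proposed scheme,
not existing code):

	/* Sketch only: vcpu->kvm->arch.gva_mm and __kvm_mipsvz_enter_guest()
	 * are hypothetical; the point is the switch_mm() bracketing. */
	int kvm_arch_vcpu_ioctl_run(struct kvm_vcpu *vcpu, struct kvm_run *run)
	{
		struct mm_struct *host_mm = current->mm;
		int r;

		/* Activate the GPA->RPA address space before entering the guest. */
		switch_mm(host_mm, vcpu->kvm->arch.gva_mm, current);
		r = __kvm_mipsvz_enter_guest(vcpu);
		/* Restore the controlling process's mm on every exit. */
		switch_mm(vcpu->kvm->arch.gva_mm, host_mm, current);

		return r;
	}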
>
> gfn_to_pfn uses the address space of the controlling process. GPA->RPA
> translation does:
>
> 1) Find the 'memory slot' (indexed by gfn).
> 2) From the 'memory slot', find the virtual address (relative to the
>    controlling process).
> 3) Walk the page table of the controlling process and retrieve the
>    physical address.
Actually, it kind of works. Here is the vm_operations_struct for the
VMAs in the guest MM using this technique:
static int kvm_mipsvz_host_fault(struct vm_area_struct *vma,
				 struct vm_fault *vmf)
{
	struct page *page[1];
	unsigned long addr;
	int npages;
	struct kvm *kvm = vma->vm_private_data;
	/* The offset into this vma maps linearly onto guest frame numbers. */
	gfn_t gfn = vmf->pgoff + (vma->vm_start >> PAGE_SHIFT);

	/* Resolve the gfn to a host virtual address via the memory slots. */
	addr = gfn_to_hva(kvm, gfn);
	if (kvm_is_error_hva(addr))
		return VM_FAULT_SIGBUS;

	/* Fault the page in through the controlling process's mm. */
	npages = get_user_pages(current, kvm->arch.host_mm, addr, 1, 1, 0,
				page, NULL);
	if (unlikely(npages != 1))
		return VM_FAULT_SIGBUS;

	vmf->page = page[0];
	return 0;
}

static const struct vm_operations_struct kvm_mipsvz_host_ops = {
	.fault = kvm_mipsvz_host_fault
};
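
For context, a vma in the guest mm would be wired up to these ops roughly
as follows; the function below is an illustrative sketch, not something
from the actual patch:

	/* Hypothetical sketch: attach the fault handler to a region of
	 * 'gva_mm' covering guest physical memory. kvm_mipsvz_map_gpa()
	 * is an illustrative name, not an existing function. */
	static int kvm_mipsvz_map_gpa(struct kvm *kvm, struct mm_struct *gva_mm,
				      unsigned long gpa_start)
	{
		struct vm_area_struct *vma;
		int ret = -EFAULT;

		down_write(&gva_mm->mmap_sem);
		vma = find_vma(gva_mm, gpa_start);
		if (vma) {
			vma->vm_ops = &kvm_mipsvz_host_ops;
			vma->vm_private_data = kvm;	/* read back in .fault */
			ret = 0;
		}
		up_write(&gva_mm->mmap_sem);
		return ret;
	}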
Most likely this screws up the page reference counts in a manner that
will leak pages. But the existing mm infrastructure is managing the
page tables, so the pages show up in the proper place in the guest.

That said, I think I will switch to a more conventional approach where
the guest page tables are managed outside of the kernel's struct
mm_struct framework. What I did works for memory, but I think it will
be very difficult to implement trap-and-emulate on memory references
this way.
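
For comparison, the conventional approach amounts to something like the
sketch below: on a GPA->RPA TLB miss, resolve the pfn with gfn_to_pfn()
and install the TLB entry directly, with no gva_mm involved (the
TLB-write helper is a hypothetical name):

	/* Sketch of the conventional path. kvm_mipsvz_write_guest_tlb()
	 * is hypothetical; it stands for arch code that installs the
	 * GPA->RPA mapping in the second-level TLB. */
	static int kvm_mipsvz_handle_gpa_fault(struct kvm_vcpu *vcpu, gpa_t gpa)
	{
		gfn_t gfn = gpa >> PAGE_SHIFT;
		pfn_t pfn = gfn_to_pfn(vcpu->kvm, gfn);

		if (is_error_pfn(pfn))
			return -EFAULT;

		kvm_mipsvz_write_guest_tlb(vcpu, gpa, pfn);
		kvm_release_pfn_clean(pfn);
		return 0;
	}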
>> Upon exiting guest mode we
>> would switch back to the original mm of the controlling process.
>>
>> For me the benefit of this approach is that all the code that
>> manages the TLB is already implemented and works well for struct
>> mm_struct. The only thing I need to do is write a vma fault
>> handler. That is a lot easier and less error prone than maintaining
>> a parallel TLB management framework and making sure it interacts
>> properly with the existing TLB code for 'normal' processes.
>>
>> Q1: Am I crazy for wanting to try this?
>
> You need the mm_struct of the controlling process to be active when
> doing GPA->RPA translations.
>> Q2: Have others tried this and rejected it? What were the reasons?
>
> I think you'll have to switch_mm back to the controlling process mm on
> every page fault (and then back to gva_mm).
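
Inside the fault handler that would look something like this helper (a
sketch; 'gva_mm' is a hypothetical field alongside the existing
'host_mm'):

	/* Sketch: resolve a host page while temporarily running on the
	 * controlling process's mm. kvm->arch.gva_mm is hypothetical. */
	static int kvm_mipsvz_get_host_page(struct kvm *kvm, unsigned long addr,
					    struct page **page)
	{
		struct mm_struct *host_mm = kvm->arch.host_mm;
		struct mm_struct *gva_mm = kvm->arch.gva_mm;
		int npages;

		switch_mm(gva_mm, host_mm, current);	/* back to controlling mm */
		npages = get_user_pages(current, host_mm, addr, 1, 1, 0,
					page, NULL);
		switch_mm(host_mm, gva_mm, current);	/* return to guest mm */

		return npages == 1 ? 0 : -EFAULT;
	}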
Thanks in advance,
David Daney
Cavium, Inc.