Re: [PATCH] x86/kvm: Handle async page faults directly through do_page_fault()

On 28/02/20 19:42, Andy Lutomirski wrote:
> KVM overloads #PF to indicate two types of not-actually-page-fault
> events.  Right now, the KVM guest code intercepts them by modifying
> the IDT and hooking the #PF vector.  This makes the already fragile
> fault code even harder to understand, and it also pollutes call
> traces with async_page_fault and do_async_page_fault for normal page
> faults.
> 
> Clean it up by moving the logic into do_page_fault() using a static
> branch.  This gets rid of the platform trap_init override mechanism
> completely.
> 
> Signed-off-by: Andy Lutomirski <luto@xxxxxxxxxx>

Acked-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>

Just one thing:

> @@ -1505,6 +1506,25 @@ do_page_fault(struct pt_regs *regs, unsigned long hw_error_code,
>  		unsigned long address)
>  {
>  	prefetchw(&current->mm->mmap_sem);
> +	/*
> +	 * KVM has two types of events that are, logically, interrupts, but
> +	 * are unfortunately delivered using the #PF vector.

At least the not-present case isn't entirely an interrupt, because it
must be delivered precisely.  Regarding the page-ready case you're
right, it could be an interrupt.  However, generally speaking, using the
#PF vector is not a problem in itself.  The mistake was passing the
event type through memory rather than overloading the error code.

> +	 * These events are
> +	 * "you just accessed valid memory, but the host doesn't have it right
> +	 * now, so I'll put you to sleep if you continue" and "that memory
> +	 * you tried to access earlier is available now."
> +	 *
> +	 * We are relying on the interrupted context being sane (valid
> +	 * RSP, relevant locks not held, etc.), which is fine as long as
> +	 * the interrupted context had IF=1.

This is not about IF=0/IF=1; the KVM code is careful about taking
spinlocks only with IRQs disabled, and async PF is not delivered if the
interrupted context had IF=0.  The problem is that the memory location
is not reentrant if an NMI is delivered in the wrong window, as you hint
below.
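
To make the window concrete, here is a minimal user-space sketch (not
kernel code) of one possible interleaving.  Every name in it is
hypothetical, and the read-and-clear behaviour of the slot is an
assumption made for illustration; it only models a shared reason field
being consumed by a nested handler before the outer handler reads it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the event types; not the kernel's definitions. */
enum { REASON_NONE = 0, REASON_NOT_PRESENT = 1, REASON_PAGE_READY = 2 };

/* Models the shared memory slot the host fills in before injecting #PF. */
static uint32_t apf_reason;

/* Consume the slot: return its current value and reset it to "none". */
static uint32_t read_and_clear_reason(void)
{
	uint32_t r = apf_reason;

	apf_reason = REASON_NONE;
	return r;
}

/*
 * Simulate one async "not present" event and return what the outer #PF
 * handler observes.  If nested_in_window is nonzero, a nested handler
 * (standing in for a #PF taken inside an NMI) consumes the slot first;
 * the value it saw is stored in *nested_saw.
 */
static uint32_t outer_handler_sees(int nested_in_window, uint32_t *nested_saw)
{
	assert(nested_saw != NULL);

	apf_reason = REASON_NOT_PRESENT;	/* host writes reason, injects #PF */
	*nested_saw = REASON_NONE;

	if (nested_in_window)
		*nested_saw = read_and_clear_reason();	/* nested fault steals it */

	return read_and_clear_reason();		/* outer handler reads the slot */
}
```

Without the nested event the outer handler sees REASON_NOT_PRESENT.
With it, the nested handler sees the async reason and the outer handler
sees REASON_NONE, so each side misclassifies its fault.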

Paolo

> We are also relying on
> +	 * the KVM async pf type field and CR2 being read consistently
> +	 * instead of getting values from real and async page faults
> +	 * mixed up.
> +	 *
> +	 * Fingers crossed.
> +	 */
> +	if (kvm_handle_async_pf(regs, hw_error_code, address))
> +		return;
> +
>  	trace_page_fault_entries(regs, hw_error_code, address);
>  
>  	if (unlikely(kmmio_fault(regs, address)))
> 



