Re: [RFC PATCH] KVM: remove the writable page for read fault case in hva_to_pfn_slow()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, 2018-10-08 at 16:11 +0000, kvm-owner@xxxxxxxxxxxxxxx wrote:
> On Mon, Oct 08, 2018 at 08:21:53AM -0700, Sean Christopherson wrote:
> > 
> > On Mon, 2018-10-08 at 22:41 +0800, kvm-owner@xxxxxxxxxxxxxxx wrote:
> > > 
> > > Case (!write_fault && writable) has been handled in hva_to_pfn_fast(),
> > > it is not necessary to try again if hva_to_pfn_fast() already failed.
> > > 
> > > This patch removes this case in hva_to_pfn_slow().
> > > 
> > > Signed-off-by: Wei Yang <richard.weiyang@xxxxxxxxx>
> > > 
> > > ---
> > > 
> > > Hope my understanding is correct.
> > > 
> > > ---
> > > ??virt/kvm/kvm_main.c | 10 ----------
> > > ??1 file changed, 10 deletions(-)
> > > 
> > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> > > index 1f42f1d474b5..c8fb3a9d81fa 100644
> > > --- a/virt/kvm/kvm_main.c
> > > +++ b/virt/kvm/kvm_main.c
> > > @@ -1403,16 +1403,6 @@ static int hva_to_pfn_slow(unsigned long addr, bool *async, bool write_fault,
> > > ??	if (npages != 1)
> > > ??		return npages;
> > > ??
> > > -	/* map read fault as writable if possible */
> > > -	if (unlikely(!write_fault) && writable) {
> > > -		struct page *wpage;
> > > -
> > > -		if (__get_user_pages_fast(addr, 1, 1, &wpage) == 1) {
> > > -			*writable = true;
> > > -			put_page(page);
> > > -			page = wpage;
> > > -		}
> > > -	}
> Hi, Sean
> 
> Thanks for your comments, let me try to understand it.
> 
> > 
> > This handles the scenario where __get_user_pages_fast() failed and
> > get_user_pages_unlocked() succeeded, obviously with !write_fault &&
> > writable. ??On a read fault, get_user_pages_unlocked() requests the
> > page with read permissions to ensure it doesn't incorrectly fail due
> > to invalid access permissions, i.e. doesn't speculatively request
> > write permissions for a read-only VMA. ??If that succeeds, we then
> I think I can understand at this point.
> 
> Based on my understanding, we set *write* to 1 for __get_user_pages_fast()
> which means we want a writable page. While in hva_to_pfn_slow(),
> FOLL_WRITE is not set for get_user_pages_unlocked(), this means a
> readable page would meet our need.
> 
> In case a read fault, if __get_user_pages_fast() failed, we still could
> try get_user_pages_unlocked() to get a readable page.
> 
> > 
> > use __get_user_pages_fast() to test if the VMA is writable. ??If the
> > VMA is indeed writable, we can create the shadow PTE with write
> > permissions even though we're handling a read fault. ??Marking the
> I can't follow up with you from this point.
> 
> __get_user_pages_fast() iterate the page table and check those
> requirements. We pass the same parameter both in hva_to_pfn_slow() and
> hva_to_pfn_fast(). Curious about why we would get different result if no
> one else change the PTE?

hva_to_pfn_slow() uses the exact access type (read vs write) when calling
get_user_pages_unlocked(), i.e. FOLL_WRITE is only set if @write_fault is
true, whereas the calls to __get_user_pages_fast() always try to pin the
page as writable.

> BTW, you mentioned __get_user_pages_fast() would test the VMA? Would you
> mind pointing which part does this? I am lost at this point.

That was poorly worded.  Let me try again:

If get_user_pages_unlocked() succeeds and we're handling a read-fault,
use __get_user_pages_fast() to check if the page was mapped writable.
Even though we only request read permissions, the PTE *may* have write
permissions if the corresponding VMA is writable, e.g. a writable VMA
will generate a read-only PTE when the page is backed by the zero page
using copy-on-write semantics.

If the PTE is indeed writable, we can also create the shadow PTE with
write permissions even though we're handling a read fault.  Marking
the SPTE as writable means we don't take another page fault if/when
the guest writes the page, e.g. in a RMW scenario we take one page
fault instead of two.

> > 
> > SPTE as writable means we don't take another page fault if/when the
> > guest writes the page, e.g. in a RMW scenario we take one page fault
> > instead of two.
> > 
> Ah, thanks. I guess this is the benefits for this behavior. This reduces
> one page fault by giving a writable page for a read fault.
> 
> > 
> > > 
> > > ??	*pfn = page_to_pfn(page);
> > > ??	return npages;
> > > ??}



[Index of Archives]     [KVM ARM]     [KVM ia64]     [KVM ppc]     [Virtualization Tools]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite Questions]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux