Re: unmap_range() premature exit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 09, 2014 at 10:42:11AM -0700, Mario Smarduch wrote:
> Hi Steve, Marc -
>    couple comments.
> 
> - unmap_range() is used by mmu notifiers, and appears to remove
> a range much larger then requested - goes upto PUD. It should
> probably be more precise in the range in invalidates, it may take
> longer but result in fewer 2nd stage faults under heavier loads. 
> May require splitting up PMDs. When I was doing live migration 
> tests putting a heavier load causing memory fragmentation and 
> swapping this led to a lot of 2nd stage faults. Removing upto 
> PUD range may be ok if you get huge pages back on subsequent 
> faults, but if system is fragmented, or you enable fetures like same 
> page merging this may cause excessive overhead.

It needs to be fixed, take a look at the other mail I sent, I will try
to make enough time to test this thing.

> 
> - The clear_pud_entry(), pmd, pte functions call put_page() on a
> page table entry. It appears if you'll do this several times for
> entries in the same table you'll be putting a page with count = 0.
> 

No, we get the page each time we add an entry and we put the page every
time we clear an entry.  So if you have 512 possible entries in a PMD,
if all of them points to a table or are block descriptors (no empty
entries in the PMD), the count will be 512 (one for each entry, and one
to keep the page reserved in the first place).  Every time a pte is
completely empty, the parent PMD's entry is cleared out and the refcount
is dropped one.  When finally the count is 1, we know there are no
entries in the table in question, and we can safely free it (by putting
the refcount once more, returning it to 0).

-Christoffer

> 
> 
> On 05/09/2014 01:25 AM, Marc Zyngier wrote:
> > On Fri, May 09 2014 at  8:58:18 am BST, Steve Capper <Steve.Capper@xxxxxxx> wrote:
> > 
> > Hi Steve,
> > 
> >> Hi,
> >> I've taken a quick look at this, I see:
> >> #define kvm_pud_addr_end(addr,end)              (end)
> >>
> >> I think it should besomething like:
> >> #define kvm_pud_addr_end(addr, end)                                     \
> >> ({      u64 __boundary = ((addr) + PUD_SIZE) & PUD_MASK;                \
> >>         (__boundary - 1 < (end) - 1)? __boundary: (end);                \
> >> })
> >>
> >> Otherwise we'll jump to end, even if end is greater than the end of the
> >> pud we're examining...
> > 
> > Certainly. But we're looking at a case where we have only 3 levels of
> > page tables, hence no PUD. My expectations would be that we're never
> > going to pass kvm_pud_addr_end anything but the result of
> > kvm_pgd_addr_end, so we wouldn't need the "bells and whistles" version
> > on ARMv7. Am I missing something obvious here?
> > 
> >> As it stands, unmap_range does not appear to handle zero or block
> >> entries for the first level descriptor. That coupled with the
> >> kvm_pud_addr_end definition will mean it can exit prematurely.
> >>
> >> pgd_none isn't checked for (which would be a problem for 4-levels), and
> >> the function could probably benefit from being split up into pgd, pud,
> >> pmd and pte levels. 
> > 
> > Yes, unmap_range is utterly borken. I wasn't reading Mario's email
> > properly (too much GICv3 in the morning... ;-).
> > 
> > Steve, would you be willing to submit a patch for that?
> > 
> > Thanks,
> > 
> > 	M.
> > 
> 
> _______________________________________________
> kvmarm mailing list
> kvmarm@xxxxxxxxxxxxxxxxxxxxx
> https://lists.cs.columbia.edu/mailman/listinfo/kvmarm
_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm




[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux