Re: unmap_range() premature exit

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 08, 2014 at 09:27:05PM +0100, Mario Smarduch wrote:
> On 05/08/2014 11:36 AM, Mario Smarduch wrote:
> > On 05/08/2014 11:14 AM, Marc Zyngier wrote:
> >> On Thu, May 08 2014 at  6:25:29 pm BST, Mario Smarduch <m.smarduch@xxxxxxxxxxx> wrote:
> >>> On 05/08/2014 10:12 AM, Marc Zyngier wrote:
> >>>> On Thu, May 08 2014 at  6:03:07 pm BST, Mario Smarduch <m.smarduch@xxxxxxxxxxx> wrote:
> >>>>> It appears that for a memslot that crosses a PUD range unmap_range may
> >>>>> return prematurely if lower pud is not mapping anything. 
> >>>>>
> >>>>> kvm_pud_addr_end(..) will return end and the loop will terminate possibly
> >>>>> leaving unmapped ranges. 
> >>>>>
> >>>>> Am I missing something?
> >>>>
> >>>> With 3-level page tables (which is what we have with LPAE on ARMv7),
> >>>> puds are the same thing as pgds (they are effectively collapsed). This
> >>>> will change with the introduction of 4-level page tables on ARMv8.
> >>>>
> >>>> 	M.
> >>>>
> >>>
> >>> But what if there is some memslot with range like
> >>> 0xffff0000 - 0x1000f0000. The lower range has not paged anything
> >>> in so its pud/pgd is none, but upper range has mappings.
> >>> Then pud_none(...) will succeed for lower PUD range and
> >>> kvm_pud_addr_end() will return 'end' and leave the upper mappings
> >>> stranded.
> >>
> >> There is something I don't understand. The usual loop would be something
> >> like (boilerplate code taken from the current KVM code base):
> >>
> >> <quote>
> >> static void stage2_flush_puds(struct kvm *kvm, pgd_t *pgd,
> >> 			      phys_addr_t addr, phys_addr_t end)
> >> {
> >> 	pud_t *pud;
> >> 	phys_addr_t next;
> >>
> >> 	pud = pud_offset(pgd, addr);
> >> 	do {
> >> 		next = kvm_pud_addr_end(addr, end);
> >> 		if (!pud_none(*pud)) {
> >>                 	[...]
> >> 		}
> >> 	} while (pud++, addr = next, addr != end);
> >> }
> >>
> >> static void stage2_flush_memslot(struct kvm *kvm,
> >> 				 struct kvm_memory_slot *memslot)
> >> {
> >> 	phys_addr_t addr = memslot->base_gfn << PAGE_SHIFT;
> >> 	phys_addr_t end = addr + PAGE_SIZE * memslot->npages;
> >> 	phys_addr_t next;
> >> 	pgd_t *pgd;
> >>
> >> 	pgd = kvm->arch.pgd + pgd_index(addr);
> >> 	do {
> >> 		next = kvm_pgd_addr_end(addr, end);
> >> 		stage2_flush_puds(kvm, pgd, addr, next);
> >> 	} while (pgd++, addr = next, addr != end);
> >> }
> >> </quote>
> >>
> >> If your memslot is across two pgd/pud, then kvm_pgd_addr_end will give
> >> you the address of the next range you're interested in, and it will be
> >> picked up at the next iteration.
> >>
> >> /me puzzled.
> >>
> >> 	M.
> >>
> Hi Marc,
> 
> I should have read you comments closer, I'm just basically repeating
> what you're illustrating above.
> 
> - Mario
> 
> > 
> > I think that with nopud, the kvm_pud_addr_end() should be inside a
> > kvm_pgd_addr_end(....). Then kvm_pud_addr_end(...) can return end, which
> > would be pgd end, but the way it is it's returning the 'end'.
> > 
> > while(addr < end) }
> >   pgd = kvm->arch.pgd + pgd_index(addr);
> >   pgd_end = kvm_pgd_addr_end(addr, end);
> >   while(addr < pgd_end) {
> >     pud = pud_offset(pgd,addr);
> >     if(pud_none(*pud)) {
> >        // this should be ok returning pgd end, but not the 'end'
> >        addr = kvm_pud_addr_end(addr, pgd_end);
> >        continue;
> >     }
> >     [....]
> >   }
> >   [ ....]
> > }
> > 

Hi,
I've taken a quick look at this, I see:
#define kvm_pud_addr_end(addr,end)              (end)

I think it should besomething like:
#define kvm_pud_addr_end(addr, end)                                     \
({      u64 __boundary = ((addr) + PUD_SIZE) & PUD_MASK;                \
        (__boundary - 1 < (end) - 1)? __boundary: (end);                \
})

Otherwise we'll jump to end, even if end is greater than the end of the
pud we're examining...

As it stands, unmap_range does not appear to handle zero or block
entries for the first level descriptor. That coupled with the
kvm_pud_addr_end definition will mean it can exit prematurely.

pgd_none isn't checked for (which would be a problem for 4-levels), and
the function could probably benefit from being split up into pgd, pud,
pmd and pte levels. 

Cheers,
-- 
Steve

_______________________________________________
kvmarm mailing list
kvmarm@xxxxxxxxxxxxxxxxxxxxx
https://lists.cs.columbia.edu/mailman/listinfo/kvmarm




[Index of Archives]     [Linux KVM]     [Spice Development]     [Libvirt]     [Libvirt Users]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux