Re: Can someone explain what free_pgd_range(), etc actually do?

On 11/03/2017 05:11 AM, Andy Lutomirski wrote:
>  - What is the intended purpose of addr, end, floor, and ceiling?
> What are the pagetable freeing functions actually *supposed* to do?

I've always logically thought of it as: the VMA (and thus addr/end) tells
us where we _must_ walk and free.  floor/ceiling tell us about
neighboring areas that are unused.  We do not have to walk the unused
areas, but we must free the page tables covering them once we clear out
their last user.

Walking is presumably expensive.  We use the VMA information and plumb
it down through floor/ceiling to make sure that we're not having to look
at a full page of data at each level every time we free a VMA.

I think that might be what's tripping you up: floor/ceiling is just an
optimization.  It's not logically required for freeing page tables, but
it does speed things up.
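
To make the floor/ceiling idea concrete, here is a user-space sketch of
the kind of bounds check that decides whether one page table page may be
freed.  It is only loosely modeled on the checks in the kernel's freeing
code and is not the kernel implementation: TABLE_SPAN, TABLE_MASK and
table_can_be_freed() are hypothetical names, the 2 MiB span is an
assumption, and [addr, end) is assumed to sit inside a single table.  A
ceiling of 0 is taken to mean "no limit at the top".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: one page table page maps 2 MiB of address space. */
#define TABLE_SPAN (UINT64_C(1) << 21)
#define TABLE_MASK (~(TABLE_SPAN - 1))

/*
 * Return true if the table covering [addr, end) may be freed.
 * floor/ceiling bound the address range known to be unused around the
 * region being unmapped; ceiling == 0 means "no upper limit".
 */
bool table_can_be_freed(uint64_t addr, uint64_t end,
                        uint64_t floor, uint64_t ceiling)
{
    assert(addr < end);

    /* The table begins at the aligned-down start address. */
    uint64_t start = addr & TABLE_MASK;

    /* A neighbor below still uses part of this table. */
    if (start < floor)
        return false;

    if (ceiling) {
        /* Round the ceiling down to a table boundary. */
        ceiling &= TABLE_MASK;
        if (!ceiling)
            return false;
    }

    /* "- 1" makes ceiling == 0 wrap around to the top of the space. */
    if (end - 1 > ceiling - 1)
        return false;

    return true;
}
```

With floor and ceiling both 0 the table is always freed.  A neighbor
that ends or begins inside the same 2 MiB span moves floor or ceiling
into the span and the function returns false, without anyone having to
scan the table for remaining entries.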

>  - Are there any invariants that, for example, there is never a
> pagetable that doesn't have any vmas at all under it?  I can
> understand how all the code would be correct if this invariant were to
> exist, but I don't see what would preserve it.  But maybe
> free_pgd_range(), etc really do preserve it.

I think it's implemented more like: the last VMA using a page table will
free the page table when the VMA is torn down.  It does this by looking
at its neighbors (or lack thereof) at unmap_region() time and expanding
the range covered by floor/ceiling.
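
As a rough illustration of "looking at its neighbors", the following
user-space sketch derives floor and ceiling from the VMAs on either side
of the region being torn down.  struct toy_vma, struct toy_bounds and
toy_free_bounds() are hypothetical names, and using 0 for both "lowest
user address" and "no upper limit" is an assumption made to keep the
example small.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, stripped-down stand-in for a VMA: [start, end). */
struct toy_vma {
    uint64_t start;
    uint64_t end;
};

struct toy_bounds {
    uint64_t floor;    /* nothing is mapped in [floor, region start) */
    uint64_t ceiling;  /* nothing is mapped in [region end, ceiling); 0 = no limit */
};

/*
 * prev is the VMA just below the region being unmapped, next the VMA
 * just above it.  Either may be NULL if there is no such neighbor.
 */
struct toy_bounds toy_free_bounds(const struct toy_vma *prev,
                                  const struct toy_vma *next)
{
    struct toy_bounds b;

    assert(!prev || !next || prev->end <= next->start);

    /* No lower neighbor: tables may be freed down to the bottom. */
    b.floor = prev ? prev->end : 0;

    /* No upper neighbor: tables may be freed all the way to the top. */
    b.ceiling = next ? next->start : 0;

    return b;
}
```

The fewer neighbors a VMA has, the wider the resulting range, which is
how the last VMA under a page table ends up being the one that frees it.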

>  - What keeps mm->mmap pointing to the lowest-addressed vma?  I see
> lots of code that seems to assume that you can start at mm->mmap,
> follow the vm_next links, and find all vmas, but I can't figure out
> why this would work.

__vma_(un)link_list() is where the magic normally happens.  It
effectively uses the rbtree to determine where to put the VMA in the
list to maintain ordering.
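
A simplified user-space sketch of how an address-ordered list can be
kept sorted on insert.  This is not the kernel code: the toy_* names are
hypothetical, and toy_find_prev() is a linear walk standing in for the
rbtree lookup that supplies the predecessor.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a VMA covering [start, end). */
struct toy_vma {
    uint64_t start, end;
    struct toy_vma *prev, *next;
};

struct toy_mm {
    struct toy_vma *mmap;  /* head of the list: lowest-addressed VMA */
};

/* Stand-in for the rbtree lookup: last VMA ending at or below vma->start. */
struct toy_vma *toy_find_prev(struct toy_mm *mm, const struct toy_vma *vma)
{
    struct toy_vma *prev = NULL;

    for (struct toy_vma *v = mm->mmap; v && v->end <= vma->start; v = v->next)
        prev = v;
    return prev;
}

/* Link vma right after prev, or at the head when there is no prev. */
void toy_link_list(struct toy_mm *mm, struct toy_vma *vma,
                   struct toy_vma *prev)
{
    struct toy_vma *next;

    vma->prev = prev;
    if (prev) {
        next = prev->next;
        prev->next = vma;
    } else {
        /* No predecessor: vma becomes the new lowest-addressed VMA. */
        next = mm->mmap;
        mm->mmap = vma;
    }
    vma->next = next;
    if (next)
        next->prev = vma;
}

void toy_insert(struct toy_mm *mm, struct toy_vma *vma)
{
    assert(vma->start < vma->end);
    toy_link_list(mm, vma, toy_find_prev(mm, vma));
}
```

Because every insert goes through the predecessor lookup, the head of
the list is always the lowest-addressed entry and following the next
links visits every entry in address order.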

>  - What happens if a process exits while mm->mmap is NULL?

You mean how do we free the page tables for it?  We had to do a bunch of
unmap_region() calls before that to axe all the VMAs, and the page tables
_should_ have been zapped then.

Now, if someone goes and just sets mm->mmap to NULL, we're obviously
screwed, but we leaked a bunch of VMAs _anyway_, in addition to the page
tables.

>  - Is there any piece of code that makes it obvious that all the
> pagetables are gone by the time the exit_mmap() finishes?

mm->nr_ptes and mm->nr_pmds (and soon nr_puds) should tell us if we
forgot to free one.  I think that's our main defense.
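
The counter-based defense can be pictured with a toy model like the one
below.  The struct and function names are hypothetical and this is not
how the kernel lays out its accounting; it only shows the pattern of
bumping a counter when a table is allocated, dropping it when the table
is freed, and checking for zero at teardown.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-address-space accounting. */
struct toy_mm {
    long nr_ptes;  /* PTE pages currently allocated */
    long nr_pmds;  /* PMD pages currently allocated */
};

void toy_pte_alloc(struct toy_mm *mm) { mm->nr_ptes++; }
void toy_pmd_alloc(struct toy_mm *mm) { mm->nr_pmds++; }

void toy_pte_free(struct toy_mm *mm)
{
    assert(mm->nr_ptes > 0);  /* freeing more than was allocated is a bug */
    mm->nr_ptes--;
}

void toy_pmd_free(struct toy_mm *mm)
{
    assert(mm->nr_pmds > 0);
    mm->nr_pmds--;
}

/* At teardown, any non-zero counter means a table was never freed. */
bool toy_mm_is_clean(const struct toy_mm *mm)
{
    return mm->nr_ptes == 0 && mm->nr_pmds == 0;
}
```

A check like this catches a leak only after the fact, and it cannot say
which table was leaked, so it is a safety net rather than a proof.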

I have some vague recollection that we also looked for zero'd page table
pages somewhere at free time, but I'm not finding it.

> Because I'm starting to wonder whether some weird combination of maps
> and unmaps will just leak pagetables, and the code is rather
> complicated, subtle, and completely lacking in documentation, and I've
> learned to be quite suspicious of such things.

There have surely been bugs.  FWIW, there's some code in the MPX
selftests that tries to map and free a bunch of random addresses to trip
up the MPX code.  I ran it a *lot* and this code never got tripped up on
it that I can remember.
