On Wed 12-09-18 09:50:54, Michal Hocko wrote: > On Tue 11-09-18 23:01:57, Tetsuo Handa wrote: > > On 2018/09/10 21:55, Michal Hocko wrote: > > > This is a very coarse implementation of the idea I've had before. > > > Please note that I haven't tested it yet. It is mostly to show the > > > direction I would wish to go for. > > > > Hmm, this patchset does not allow me to boot. ;-) > > > > free_pgd_range(&tlb, vma->vm_start, vma->vm_prev->vm_end, > > FIRST_USER_ADDRESS, USER_PGTABLES_CEILING); > > > > [ 1.875675] sched_clock: Marking stable (1810466565, 65169393)->(1977240380, -101604422) > > [ 1.877833] registered taskstats version 1 > > [ 1.877853] Loading compiled-in X.509 certificates > > [ 1.878835] zswap: loaded using pool lzo/zbud > > [ 1.880835] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008 > > This is vm_prev == NULL. I thought we always have vm_prev as long as > this is not a single VMA in the address space. I will double check this. So this is me misunderstanding the code. vm_next, vm_prev are not a full doubly linked list. The first entry doesn't really refer to the last entry. So the above cannot work at all. We can go around this in two ways. Either keep the iteration or use the following which should cover the full mapped range, unless I am missing something diff --git a/mm/mmap.c b/mm/mmap.c index 64e8ccce5282..078295344a17 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -3105,7 +3105,7 @@ void exit_mmap(struct mm_struct *mm) up_write(&mm->mmap_sem); } - free_pgd_range(&tlb, vma->vm_start, vma->vm_prev->vm_end, + free_pgd_range(&tlb, vma->vm_start, mm->highest_vm_end, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING); tlb_finish_mmu(&tlb, 0, -1); -- Michal Hocko SUSE Labs