On Wed, May 15, 2019 at 01:11:46PM +0530, Anshuman Khandual wrote:
> 
> 
> On 05/15/2019 05:21 AM, Roman Gushchin wrote:
> > __vunmap() calls find_vm_area() twice without an obvious reason:
> > first directly to get the area pointer, second indirectly by calling
> > vm_remove_mappings()->remove_vm_area(), which is again searching
> > for the area.
> > 
> > To remove this redundancy, let's split remove_vm_area() into
> > __remove_vm_area(struct vmap_area *), which performs the actual area
> > removal, and remove_vm_area(const void *addr) wrapper, which can
> > be used everywhere, where it has been used before. Let's pass
> > a pointer to the vm_area instead of vm_struct to vm_remove_mappings(),
> > so it can pass it to __remove_vm_area() and avoid the redundant area
> > lookup.
> > 
> > On my test setup, I've got 5-10% speed up on vfree()'ing 1000000
> > of 4-pages vmalloc blocks.
> > 
> > Perf report before:
> >  29.44%  cat  [kernel.kallsyms]  [k] free_unref_page
> >  11.88%  cat  [kernel.kallsyms]  [k] find_vmap_area
> >   9.28%  cat  [kernel.kallsyms]  [k] __free_pages
> >   7.44%  cat  [kernel.kallsyms]  [k] __slab_free
> >   7.28%  cat  [kernel.kallsyms]  [k] vunmap_page_range
> >   4.56%  cat  [kernel.kallsyms]  [k] __vunmap
> >   3.64%  cat  [kernel.kallsyms]  [k] __purge_vmap_area_lazy
> >   3.04%  cat  [kernel.kallsyms]  [k] __free_vmap_area
> > 
> > Perf report after:
> >  32.41%  cat  [kernel.kallsyms]  [k] free_unref_page
> >   7.79%  cat  [kernel.kallsyms]  [k] find_vmap_area
> >   7.40%  cat  [kernel.kallsyms]  [k] __slab_free
> >   7.31%  cat  [kernel.kallsyms]  [k] vunmap_page_range
> >   6.84%  cat  [kernel.kallsyms]  [k] __free_pages
> >   6.01%  cat  [kernel.kallsyms]  [k] __vunmap
> >   3.98%  cat  [kernel.kallsyms]  [k] smp_call_function_single
> >   3.81%  cat  [kernel.kallsyms]  [k] __purge_vmap_area_lazy
> >   2.77%  cat  [kernel.kallsyms]  [k] __free_vmap_area
> > 
> > Signed-off-by: Roman Gushchin <guro@xxxxxx>
> > Cc: Johannes Weiner <hannes@xxxxxxxxxxx>
> > Cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> > Cc: Vlastimil Babka <vbabka@xxxxxxx>
> > ---
> >  mm/vmalloc.c | 52 +++++++++++++++++++++++++++++-----------------------
> >  1 file changed, 29 insertions(+), 23 deletions(-)
> > 
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index c42872ed82ac..8d4907865614 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -2075,6 +2075,22 @@ struct vm_struct *find_vm_area(const void *addr)
> >  	return NULL;
> >  }
> >  
> > +static struct vm_struct *__remove_vm_area(struct vmap_area *va)
> > +{
> > +	struct vm_struct *vm = va->vm;
> > +
> > +	spin_lock(&vmap_area_lock);
> > +	va->vm = NULL;
> > +	va->flags &= ~VM_VM_AREA;
> > +	va->flags |= VM_LAZY_FREE;
> > +	spin_unlock(&vmap_area_lock);
> > +
> > +	kasan_free_shadow(vm);
> > +	free_unmap_vmap_area(va);
> > +
> > +	return vm;
> > +}
> > +
> >  /**
> >   * remove_vm_area - find and remove a continuous kernel virtual area
> >   * @addr: base address
> > @@ -2087,26 +2103,14 @@ struct vm_struct *find_vm_area(const void *addr)
> >   */
> >  struct vm_struct *remove_vm_area(const void *addr)
> >  {
> > +	struct vm_struct *vm = NULL;
> >  	struct vmap_area *va;
> >  
> > -	might_sleep();
> 
> Is not this necessary any more ?

We've discussed it here: https://lkml.org/lkml/2019/4/17/1098 .
Tl;dr it's not that useful.
> 
> > -
> >  	va = find_vmap_area((unsigned long)addr);
> > -	if (va && va->flags & VM_VM_AREA) {
> > -		struct vm_struct *vm = va->vm;
> > -
> > -		spin_lock(&vmap_area_lock);
> > -		va->vm = NULL;
> > -		va->flags &= ~VM_VM_AREA;
> > -		va->flags |= VM_LAZY_FREE;
> > -		spin_unlock(&vmap_area_lock);
> > -
> > -		kasan_free_shadow(vm);
> > -		free_unmap_vmap_area(va);
> > +	if (va && va->flags & VM_VM_AREA)
> > +		vm = __remove_vm_area(va);
> >  
> > -		return vm;
> > -	}
> > -	return NULL;
> > +	return vm;
> >  }
> 
> Other callers of remove_vm_area() cannot use __remove_vm_area() directly as well
> to save a look up ?

I'll take a look. Good idea, thanks!

Roman
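
P.S. To illustrate the kind of conversion I have in mind -- a rough,
untested sketch, with a made-up helper name used only for illustration:
any path that has already looked up the vmap_area could call
__remove_vm_area() directly and skip the extra find_vmap_area() that
the remove_vm_area(addr) wrapper would do:

	/* hypothetical caller that already holds the vmap_area */
	static struct vm_struct *detach_vm_area(struct vmap_area *va)
	{
		/* same check remove_vm_area() performs after its lookup */
		if (va && va->flags & VM_VM_AREA)
			return __remove_vm_area(va);

		return NULL;
	}

Whether the existing remove_vm_area() callers actually have the
vmap_area at hand still needs checking, as said above.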