On Tue, 2018-11-06 at 13:05 -0800, Andrew Morton wrote:
> On Fri, 2 Nov 2018 12:25:17 -0700 Rick Edgecombe <rick.p.edgecombe@xxxxxxxxx>
> wrote:
>
> > Create __vmalloc_node_try_addr function that tries to allocate at a specific
> > address without triggering any lazy purging. In order to support this
> > behavior a try_addr argument was plugged into several of the static helpers.
>
> Please explain (in the changelog) why lazy purging is considered to be
> a problem. Preferably with some form of measurements, or at least a
> hand-wavy guesstimate of the cost.

Sure, I'll update it to be more clear. The problem is that when
__vmalloc_node_range fails to allocate (in this case it tries a single random
spot that doesn't fit), it triggers a purge_vmap_area_lazy and then retries the
allocation in the same spot. That retry doesn't make much sense in this case,
since we are not trying over a large area. While it will usually not flush the
TLB, it does extra work every time for what is an unlikely case in this
situation: a lazy free area blocking the allocation.

The average allocation time in ns for different versions, as measured by the
included kselftest:

Modules  Vmalloc optimization  No Vmalloc Optimization  Existing Module KASLR
1000     1433                  1993                     3821
2000     2295                  3681                     7830
3000     4424                  7450                     13012
4000     7746                  13824                    18106
5000     12721                 21852                    22572
6000     19724                 33926                    26443
7000     27638                 47427                    30473
8000     37745                 64443                    34200

The other optimization is not kmalloc-ing in __get_vm_area_node until after
the address has been tried, which IIRC had a smaller but still noticeable
performance boost.

These allocations are not taking very long, but it may show up on systems with
very high usage of the module space (BPF JITs). If the trade-off of touching
vmalloc doesn't seem worth it to people, I'm happy to remove the optimizations.

> > This also changes logic in __get_vm_area_node to be faster in cases where
> > allocations fail due to no space, which is a lot more common when trying
> > specific addresses.