On Tue, May 21, 2019 at 9:51 AM Edgecombe, Rick P
<rick.p.edgecombe@xxxxxxxxx> wrote:
>
> On Tue, 2019-05-21 at 09:17 -0700, Andy Lutomirski wrote:
> > On Mon, May 20, 2019 at 4:39 PM Rick Edgecombe
> > <rick.p.edgecombe@xxxxxxxxx> wrote:
> > > From: Rick Edgecombe <redgecombe.lkml@xxxxxxxxx>
> > >
> > > Calling vm_unmap_aliases() in vm_remove_mappings() could potentially
> > > be a lot of work to do on a free operation. Simply flushing the TLB
> > > instead of doing the whole vm_unmap_aliases() operation makes the
> > > frees faster and pushes the heavy work to happen on allocation, where
> > > it would be more expected. In addition to the extra work,
> > > vm_unmap_aliases() takes some locks, including a long hold of
> > > vmap_purge_lock, which will make all other VM_FLUSH_RESET_PERMS
> > > vfrees wait while the purge operation happens.
> > >
> > > Lastly, page_address() can involve locking and lookups on some
> > > configurations, so skip calling it by exiting early when
> > > !CONFIG_ARCH_HAS_SET_DIRECT_MAP.
> >
> > Hmm. I would have expected that the major cost of vm_unmap_aliases()
> > would be the flush, and at least informing the code that the flush
> > happened seems valuable. So I would guess that this patch is actually
> > a loss in throughput.
> >
> You are probably right about the flush taking the longest. The original
> idea of using it was exactly to improve throughput by saving a flush.
> However, with vm_unmap_aliases() the flush will be over a larger range
> than before for most architectures, since it will likely span from the
> module space to vmalloc. From poking around the sparc TLB flush
> history, I guess the lazy purges used to be (still are?) a problem for
> them because it would try to flush each page individually for some
> CPUs. Not sure about all of the other architectures, but for any
> implementation like that, using vm_unmap_aliases() would turn an
> occasional long operation into a more frequent one.
>
> On x86, it shouldn't be a problem to use it. We already used to call
> this function several times around an exec-permission vfree.
>
> I guess it's a tradeoff that depends on how fast large-range TLB
> flushes usually are compared to small ones. I am ok dropping it if it
> doesn't seem worth it.

On x86, a full flush is probably not much slower than just flushing a
page or two -- the main cost is in the TLB refill. I don't know about
other architectures.

I would drop this patch unless you have numbers suggesting that it's a
win.
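
For reference, a minimal sketch of the two free-path strategies weighed
above; this is illustrative only, not the actual patch.
vm_unmap_aliases(), flush_tlb_kernel_range() and get_vm_area_size() are
existing kernel interfaces, but the helper name, its bool parameter and
the overall structure here are assumptions made for the sketch.

	#include <linux/vmalloc.h>	/* vm_unmap_aliases(), get_vm_area_size() */
	#include <asm/tlbflush.h>	/* flush_tlb_kernel_range() */

	/* Hypothetical helper contrasting the two approaches on free. */
	static void free_path_flush(struct vm_struct *area, bool use_unmap_aliases)
	{
		unsigned long start = (unsigned long)area->addr;
		unsigned long end = start + get_vm_area_size(area);

		if (use_unmap_aliases) {
			/*
			 * Purge lazily unmapped regions and flush. On most
			 * arches the flushed range can span from module space
			 * to vmalloc, and vmap_purge_lock is held, so other
			 * VM_FLUSH_RESET_PERMS frees wait behind the purge.
			 */
			vm_unmap_aliases();
		} else {
			/*
			 * Flush only this mapping's own range and leave the
			 * heavier purge work to happen later, e.g. at
			 * allocation time.
			 */
			flush_tlb_kernel_range(start, end);
		}
	}

The tradeoff in the thread is essentially which branch of this sketch is
cheaper on a given architecture: one wide flush plus lock contention, or
a narrow flush now with the purge deferred.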