On Wed, Jan 05, 2022 at 08:17:08PM -0500, Daniel Jordan wrote:
> On Wed, Jan 05, 2022 at 08:53:39PM -0400, Jason Gunthorpe wrote:
> > On Wed, Jan 05, 2022 at 07:46:48PM -0500, Daniel Jordan wrote:
> > > padata threads hold mmap_lock as reader for the majority of their
> > > runtime in order to call pin_user_pages_remote(), but they also
> > > periodically take mmap_lock as writer for short periods to adjust
> > > mm->locked_vm, hurting parallelism.
> > >
> > > Alleviate the write-side contention with a per-thread cache of locked_vm
> > > which allows taking mmap_lock as writer far less frequently.
> > >
> > > Failure to refill the cache due to insufficient locked_vm will not cause
> > > the entire pinning operation to error out. This avoids spurious failure
> > > in case some pinned pages aren't accounted to locked_vm.
> > >
> > > Cache size is limited to provide some protection in the unlikely event
> > > of a concurrent locked_vm accounting operation in the same address space
> > > needlessly failing in case the cache takes more locked_vm than it needs.
> >
> > Why not just do the pinned page accounting once at the start? Why does
> > it have to be done incrementally?
>
> Yeah, good question. I tried doing it that way recently and it did
> improve performance a bit, but I thought it wasn't enough of a gain to
> justify how it overaccounted by the size of the entire pin.

Why would it over account?

Jason