* Peter Xu <peterx@xxxxxxxxxx> wrote: > This patch provides a ~12% perf boost on my aarch64 test VM with a simple > program sequentially dirtying 400MB shmem file being mmap()ed and these are > the time it needs: > > Before: 650.980 ms (+-1.94%) > After: 569.396 ms (+-1.38%) Nice! > arch/x86/mm/fault.c | 4 ++++ Reviewed-by: Ingo Molnar <mingo@xxxxxxxxxx> Minor comment typo: > + /* > + * We should do the same as VM_FAULT_RETRY, but let's not > + * return -EBUSY since that's not reflecting the reality on > + * what has happened - we've just fully completed a page > + * fault, with the mmap lock released. Use -EAGAIN to show > + * that we want to take the mmap lock _again_. > + */ s/reflecting the reality on what has happened /reflecting the reality of what has happened > ret = handle_mm_fault(vma, address, fault_flags, NULL); > + > + if (ret & VM_FAULT_COMPLETED) { > + /* > + * NOTE: it's a pity that we need to retake the lock here > + * to pair with the unlock() in the callers. Ideally we > + * could tell the callers so they do not need to unlock. > + */ > + mmap_read_lock(mm); > + *unlocked = true; > + return 0; Indeed that's a pity - I guess more performance could be gained here, especially in highly parallel threaded workloads? Thanks, Ingo