* Peter Xu <peterx@xxxxxxxxxx> wrote:
> This patch provides a ~12% perf boost on my aarch64 test VM with a simple
> program sequentially dirtying 400MB shmem file being mmap()ed and these are
> the time it needs:
>
>   Before: 650.980 ms (+-1.94%)
>   After:  569.396 ms (+-1.38%)

Nice!
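
For reference, a test of the kind described above (sequentially dirtying an
mmap()ed shmem file) could look roughly like the sketch below. This is not the
actual test program, which is not included in this thread; the function name
dirty_shmem_sequential(), the /dev/shm path and the /tmp fallback are all
assumptions made for illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from the patch: map a shared file of `len`
 * bytes and write one byte to each page in ascending order, so that every
 * page takes a write fault. Returns the number of pages dirtied, or -1 on
 * error.
 */
long dirty_shmem_sequential(size_t len)
{
	/*
	 * Assumption: /dev/shm is a tmpfs (shmem) mount. /tmp is only a
	 * fallback so the sketch still runs elsewhere; it may not be shmem.
	 */
	char shm_path[] = "/dev/shm/dirty-XXXXXX";
	char tmp_path[] = "/tmp/dirty-XXXXXX";
	char *path = shm_path;
	long page = sysconf(_SC_PAGESIZE);
	long touched = 0;
	char *buf;
	int fd;

	if (page <= 0 || len == 0)
		return -1;

	fd = mkstemp(path);
	if (fd < 0) {
		path = tmp_path;
		fd = mkstemp(path);
	}
	if (fd < 0)
		return -1;
	unlink(path);	/* the file lives on until fd and mapping are gone */

	if (ftruncate(fd, (off_t)len) != 0) {
		close(fd);
		return -1;
	}

	buf = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	close(fd);	/* the mapping keeps the file alive */
	if (buf == MAP_FAILED)
		return -1;

	/* One store per page, lowest address first. */
	for (size_t off = 0; off < len; off += (size_t)page) {
		((volatile char *)buf)[off] = 1;
		touched++;
	}

	munmap(buf, len);
	return touched;
}
```

Calling dirty_shmem_sequential(400UL << 20) from main() and timing the run
would approximate the 400MB case quoted above; absolute numbers will differ
from machine to machine.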

> arch/x86/mm/fault.c | 4 ++++

Reviewed-by: Ingo Molnar <mingo@xxxxxxxxxx>

Minor comment typo:

> +		/*
> +		 * We should do the same as VM_FAULT_RETRY, but let's not
> +		 * return -EBUSY since that's not reflecting the reality on
> +		 * what has happened - we've just fully completed a page
> +		 * fault, with the mmap lock released. Use -EAGAIN to show
> +		 * that we want to take the mmap lock _again_.
> +		 */

  s/reflecting the reality on what has happened
   /reflecting the reality of what has happened

>  	ret = handle_mm_fault(vma, address, fault_flags, NULL);
> +
> +	if (ret & VM_FAULT_COMPLETED) {
> +		/*
> +		 * NOTE: it's a pity that we need to retake the lock here
> +		 * to pair with the unlock() in the callers. Ideally we
> +		 * could tell the callers so they do not need to unlock.
> +		 */
> +		mmap_read_lock(mm);
> +		*unlocked = true;
> +		return 0;

Indeed that's a pity - I guess more performance could be gained here,
especially in highly parallel threaded workloads?

Thanks,

	Ingo
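
The lock pairing discussed in the last hunk can be modelled in userspace
roughly as follows. This is a toy model only, not kernel code: struct mock_mm,
the mock_*() helpers and the MOCK_FAULT_COMPLETED value are made up for
illustration, and the real VM_FAULT_* constants and mmap lock API differ. It
shows why the lock is retaken: the caller entered with the lock held and will
unlock unconditionally afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Made-up flag value; not the kernel's VM_FAULT_COMPLETED. */
#define MOCK_FAULT_COMPLETED 0x1u

/* Stand-in for the mm: only tracks whether the read lock is held. */
struct mock_mm {
	int read_locked;
};

void mock_read_lock(struct mock_mm *mm)
{
	assert(!mm->read_locked);	/* catch double locking */
	mm->read_locked = 1;
}

void mock_read_unlock(struct mock_mm *mm)
{
	assert(mm->read_locked);	/* catch unbalanced unlock */
	mm->read_locked = 0;
}

/*
 * Stand-in for the fault handler: when the fault is reported as completed,
 * the lock has already been released on return.
 */
unsigned int mock_handle_fault(struct mock_mm *mm, unsigned int result)
{
	if (result & MOCK_FAULT_COMPLETED)
		mock_read_unlock(mm);
	return result;
}

/*
 * Mirrors the shape of the quoted hunk: the caller holds the lock on entry
 * and expects to hold it on return, so a completed fault means retaking it.
 */
int mock_fixup_fault(struct mock_mm *mm, unsigned int result, bool *unlocked)
{
	unsigned int ret = mock_handle_fault(mm, result);

	if (ret & MOCK_FAULT_COMPLETED) {
		mock_read_lock(mm);	/* pairs with the caller's unlock */
		*unlocked = true;	/* tell the caller the lock was dropped */
		return 0;
	}
	return 0;
}
```

In this model the extra mock_read_lock() exists purely so that the caller's
later unlock stays balanced, which is the round trip the NOTE comment calls a
pity.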