On Tue, Apr 14, 2020 at 07:03:12PM +0200, Michal Hocko wrote:
> On Tue 14-04-20 21:08:32, Prathu Baronia wrote:
> > In !HIGHMEM cases, specially in 64-bit architectures, we don't need temp mapping
> > of pages. Hence, k(map|unmap)_atomic() acts as nothing more than multiple
> > barrier() calls, for example for a 2MB hugepage in clear_huge_page() these are
> > called 512 times i.e. to map and unmap each subpage that means in total 2048
> > barrier calls.

I think barrier() only matters at compile time.

> > This called for optimization. Simply getting VADDR from page does
> > the job for us. This also applies to the copy_user_huge_page() function.
>
> I still have hard time to see why kmap machinery should introduce any
> slowdown here. Previous data posted while discussing v1 didn't really
> show anything outside of the noise.

Maybe pagefault_disable/enable are barely showing up. Alternatively, do you
have CONFIG_PREEMPT_COUNT enabled?
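
For reference, here is a rough user-space model of the counting argument quoted
above. It is a sketch, not kernel code: the model_* functions, the counters and
the preempt_count_enabled flag are made up for illustration, and the bodies are
simplified from my reading of the !CONFIG_HIGHMEM path (preempt_disable() plus
pagefault_disable(), then page_address()). It assumes 4KB base pages and a
GCC/Clang-style compiler barrier.

```c
/* User-space model only, NOT kernel code. Names loosely mirror the kernel's,
 * but every model_* function here is a simplified, hypothetical stand-in. */
#include <assert.h>

#define SUBPAGES_PER_2MB 512	/* 2MB / 4KB, assuming 4KB base pages */

static unsigned long barrier_calls;	/* bookkeeping for this model only */
static int pagefault_disabled;		/* stands in for the per-task counter */
static int preempt_count;		/* only used when PREEMPT_COUNT is modeled */

/* The empty asm with a "memory" clobber is a compiler barrier: it restricts
 * compiler reordering and emits no instruction. The increment is extra
 * bookkeeping added for this model and is not part of a real barrier(). */
#define barrier() \
	do { __asm__ __volatile__("" ::: "memory"); barrier_calls++; } while (0)

static void model_preempt_disable(int preempt_count_enabled)
{
	if (preempt_count_enabled)
		preempt_count++;	/* real work: a counter update */
	barrier();
}

static void model_preempt_enable(int preempt_count_enabled)
{
	barrier();
	if (preempt_count_enabled)
		preempt_count--;	/* resched check omitted in this model */
}

static void model_pagefault_disable(void)
{
	pagefault_disabled++;		/* real work even without PREEMPT_COUNT */
	barrier();
}

static void model_pagefault_enable(void)
{
	barrier();
	pagefault_disabled--;
}

static void *model_kmap_atomic(void *vaddr, int preempt_count_enabled)
{
	model_preempt_disable(preempt_count_enabled);
	model_pagefault_disable();
	return vaddr;			/* no temporary mapping is set up */
}

static void model_kunmap_atomic(int preempt_count_enabled)
{
	model_pagefault_enable();
	model_preempt_enable(preempt_count_enabled);
}

/* Walk the subpages of one 2MB huge page; return the barrier() count. */
unsigned long model_clear_huge_page(int preempt_count_enabled)
{
	static char subpage[64];	/* tiny stand-in for a real subpage */
	int i;

	barrier_calls = 0;
	for (i = 0; i < SUBPAGES_PER_2MB; i++) {
		char *v = model_kmap_atomic(subpage, preempt_count_enabled);

		v[0] = 0;		/* placeholder for clearing the subpage */
		model_kunmap_atomic(preempt_count_enabled);
	}
	/* Every map must be balanced by an unmap. */
	assert(pagefault_disabled == 0 && preempt_count == 0);
	return barrier_calls;
}
```

In this model each map/unmap pair passes through four barrier() sites, so 512
subpages give the 2048 figure quoted above. The barriers themselves generate no
code; what is left at run time is the pagefault_disabled update and, if
CONFIG_PREEMPT_COUNT is set, the preempt counter update.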