On Sun, 10 Jan 2021, Al Viro wrote:

> On Sun, Jan 10, 2021 at 04:14:55PM -0500, Mikulas Patocka wrote:
>
> > That's a good point. I split nvfs_rw_iter into the separate functions
> > nvfs_read_iter and nvfs_write_iter - and inlined nvfs_rw_iter_locked
> > into both of them. It improved performance by 1.3%.
> >
> > > Not that it had been more useful on the write side, really,
> > > but that's another story (nvfs_write_pages() handling of
> > > copyin is... interesting). Let's figure out what's going
> > > on with the read overhead first...
> > >
> > > lib/iov_iter.c primitives certainly could use massage for
> > > better code generation, but let's find out how much of the
> > > PITA is due to those and how much comes from you fighting
> > > the damn thing instead of using it sanely...
> >
> > The results are:
> >
> > read:                                       6.744s
> > read_iter:                                  7.417s
> > read_iter - separate read and write path:   7.321s
> > Al's read_iter:                             7.182s
> > Al's read_iter with _copy_to_iter:          7.181s
>
> So
> 	* the overhead of the hardening stuff is noise here
> 	* switching to a more straightforward ->read_iter() cuts
> 	  the overhead by about 1/3.
>
> Interesting... I wonder how much of that is spent in the
> iterate_and_advance() glue inside copy_to_iter() here. There is
> certainly quite a bit of optimization possible in those
> primitives, and your use case makes a decent test for that...
>
> Could you profile that and see where it is spending
> the time, at the instruction level?
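A side note before the profiles, since the table above compares
copy_to_iter() with _copy_to_iter(): the only difference between the two
is the hardened-usercopy check, so the iterate_and_advance() glue is paid
on both paths. A simplified sketch of the wrapper in include/linux/uio.h
(not the verbatim source):

	/*
	 * copy_to_iter() only adds the usercopy-hardening sanity check;
	 * the actual segment walk (the iterate_and_advance() glue) is in
	 * _copy_to_iter() in lib/iov_iter.c and runs either way.
	 */
	static __always_inline __must_check
	size_t copy_to_iter(const void *addr, size_t bytes, struct iov_iter *i)
	{
		if (unlikely(!check_copy_size(addr, bytes, true)))
			return 0;
		return _copy_to_iter(addr, bytes, i);
	}

That is consistent with the two "Al's read_iter" rows above differing by
only 0.001s.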
This is the read method profile:

time 9.056s

    52.69%  pread  [kernel.vmlinux]  [k] copy_user_generic_string
     6.24%  pread  [kernel.vmlinux]  [k] current_time
     6.22%  pread  [kernel.vmlinux]  [k] entry_SYSCALL_64
     4.88%  pread  libc-2.31.so      [.] __libc_pread
     3.75%  pread  [kernel.vmlinux]  [k] syscall_return_via_sysret
     3.63%  pread  [nvfs]            [k] nvfs_read
     2.83%  pread  [nvfs]            [k] nvfs_bmap
     2.81%  pread  [kernel.vmlinux]  [k] vfs_read
     2.63%  pread  [kernel.vmlinux]  [k] __x64_sys_pread64
     2.27%  pread  [kernel.vmlinux]  [k] __fsnotify_parent
     2.19%  pread  [kernel.vmlinux]  [k] entry_SYSCALL_64_after_hwframe
     1.55%  pread  [kernel.vmlinux]  [k] atime_needs_update
     1.17%  pread  [kernel.vmlinux]  [k] syscall_enter_from_user_mode
     1.15%  pread  [kernel.vmlinux]  [k] touch_atime
     0.84%  pread  [kernel.vmlinux]  [k] down_read
     0.82%  pread  [kernel.vmlinux]  [k] syscall_exit_to_user_mode
     0.71%  pread  [kernel.vmlinux]  [k] do_syscall_64
     0.68%  pread  [kernel.vmlinux]  [k] ktime_get_coarse_real_ts64
     0.66%  pread  [kernel.vmlinux]  [k] __fget_light
     0.53%  pread  [kernel.vmlinux]  [k] exit_to_user_mode_prepare
     0.45%  pread  [kernel.vmlinux]  [k] up_read
     0.44%  pread  pread             [.] main
     0.44%  pread  [kernel.vmlinux]  [k] syscall_exit_to_user_mode_prepare
     0.26%  pread  [kernel.vmlinux]  [k] entry_SYSCALL_64_safe_stack
     0.12%  pread  pread             [.] pread@plt
     0.07%  pread  [kernel.vmlinux]  [k] __fdget
     0.00%  perf   [kernel.vmlinux]  [k] x86_pmu_enable_all

This is the profile of "read_iter - separate read and write path":

time 10.058s

    53.05%  pread  [kernel.vmlinux]  [k] copy_user_generic_string
     6.82%  pread  [kernel.vmlinux]  [k] current_time
     6.27%  pread  [nvfs]            [k] nvfs_read_iter
     4.70%  pread  [kernel.vmlinux]  [k] entry_SYSCALL_64
     3.20%  pread  libc-2.31.so      [.] __libc_pread
     2.77%  pread  [kernel.vmlinux]  [k] syscall_return_via_sysret
     2.31%  pread  [kernel.vmlinux]  [k] vfs_read
     2.15%  pread  [kernel.vmlinux]  [k] new_sync_read
     2.06%  pread  [kernel.vmlinux]  [k] __fsnotify_parent
     2.02%  pread  [nvfs]            [k] nvfs_bmap
     1.87%  pread  [kernel.vmlinux]  [k] entry_SYSCALL_64_after_hwframe
     1.86%  pread  [kernel.vmlinux]  [k] iov_iter_advance
     1.62%  pread  [kernel.vmlinux]  [k] __x64_sys_pread64
     1.40%  pread  [kernel.vmlinux]  [k] atime_needs_update
     0.99%  pread  [kernel.vmlinux]  [k] syscall_enter_from_user_mode
     0.85%  pread  [kernel.vmlinux]  [k] touch_atime
     0.85%  pread  [kernel.vmlinux]  [k] down_read
     0.84%  pread  [kernel.vmlinux]  [k] syscall_exit_to_user_mode
     0.78%  pread  [kernel.vmlinux]  [k] ktime_get_coarse_real_ts64
     0.65%  pread  [kernel.vmlinux]  [k] __fget_light
     0.57%  pread  [kernel.vmlinux]  [k] exit_to_user_mode_prepare
     0.53%  pread  [kernel.vmlinux]  [k] syscall_exit_to_user_mode_prepare
     0.45%  pread  pread             [.] main
     0.43%  pread  [kernel.vmlinux]  [k] up_read
     0.43%  pread  [kernel.vmlinux]  [k] do_syscall_64
     0.28%  pread  [kernel.vmlinux]  [k] iov_iter_init
     0.16%  pread  [kernel.vmlinux]  [k] entry_SYSCALL_64_safe_stack
     0.09%  pread  pread             [.] pread@plt
     0.03%  pread  [kernel.vmlinux]  [k] __fdget
     0.00%  pread  [kernel.vmlinux]  [k] update_rt_rq_load_avg
     0.00%  perf   [kernel.vmlinux]  [k] x86_pmu_enable_all

This is your read_iter_locked profile (read_iter_locked is inlined into
nvfs_read_iter):

time 10.056s

    50.71%  pread  [kernel.vmlinux]  [k] copy_user_generic_string
     6.95%  pread  [kernel.vmlinux]  [k] current_time
     5.22%  pread  [kernel.vmlinux]  [k] entry_SYSCALL_64
     4.29%  pread  libc-2.31.so      [.] __libc_pread
     4.17%  pread  [nvfs]            [k] nvfs_read_iter
     3.20%  pread  [kernel.vmlinux]  [k] syscall_return_via_sysret
     2.66%  pread  [kernel.vmlinux]  [k] _copy_to_iter
     2.44%  pread  [kernel.vmlinux]  [k] __x64_sys_pread64
     2.38%  pread  [kernel.vmlinux]  [k] new_sync_read
     2.37%  pread  [kernel.vmlinux]  [k] entry_SYSCALL_64_after_hwframe
     2.26%  pread  [kernel.vmlinux]  [k] vfs_read
     2.02%  pread  [nvfs]            [k] nvfs_bmap
     1.88%  pread  [kernel.vmlinux]  [k] __fsnotify_parent
     1.46%  pread  [kernel.vmlinux]  [k] atime_needs_update
     1.08%  pread  [kernel.vmlinux]  [k] touch_atime
     0.83%  pread  [kernel.vmlinux]  [k] syscall_exit_to_user_mode
     0.82%  pread  [kernel.vmlinux]  [k] syscall_enter_from_user_mode
     0.75%  pread  [kernel.vmlinux]  [k] syscall_exit_to_user_mode_prepare
     0.73%  pread  [kernel.vmlinux]  [k] __fget_light
     0.65%  pread  [kernel.vmlinux]  [k] down_read
     0.58%  pread  pread             [.] main
     0.58%  pread  [kernel.vmlinux]  [k] exit_to_user_mode_prepare
     0.52%  pread  [kernel.vmlinux]  [k] ktime_get_coarse_real_ts64
     0.48%  pread  [kernel.vmlinux]  [k] up_read
     0.42%  pread  [kernel.vmlinux]  [k] do_syscall_64
     0.28%  pread  [kernel.vmlinux]  [k] iov_iter_init
     0.13%  pread  [kernel.vmlinux]  [k] __fdget
     0.12%  pread  [kernel.vmlinux]  [k] entry_SYSCALL_64_safe_stack
     0.03%  pread  pread             [.] pread@plt
     0.00%  perf   [kernel.vmlinux]  [k] x86_pmu_enable_all

Mikulas
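PS: the profiles above come from a tight pread() microbenchmark (the
"main" and "pread@plt" entries are the test program itself). A minimal
sketch of such a loop - the file path, buffer size and iteration count
here are illustrative, not the exact test used above:

	#define _XOPEN_SOURCE 500
	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		static char buf[4096];
		long i;
		int fd = open("/mnt/test/file", O_RDONLY);

		if (fd < 0) {
			perror("open");
			return 1;
		}
		/* hammer pread() on one block to measure per-call overhead */
		for (i = 0; i < 10000000; i++) {
			if (pread(fd, buf, sizeof buf, 0) != sizeof buf) {
				perror("pread");
				return 1;
			}
		}
		close(fd);
		return 0;
	}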