On Wed, Oct 5, 2022 at 5:07 PM Steven Rostedt <rostedt@xxxxxxxxxxx> wrote:
>
> On Wed, 5 Oct 2022 22:54:15 +0800
> Xu Kuohai <xukuohai@xxxxxxxxxx> wrote:
>
> > 1.3 attach bpf prog with direct call, bpftrace -e 'kfunc:vfs_write {}'
> >
> > # dd if=/dev/zero of=/dev/null count=1000000
> > 1000000+0 records in
> > 1000000+0 records out
> > 512000000 bytes (512 MB, 488 MiB) copied, 1.72973 s, 296 MB/s
> >
> >
> > 1.4 attach bpf prog with indirect call, bpftrace -e 'kfunc:vfs_write {}'
> >
> > # dd if=/dev/zero of=/dev/null count=1000000
> > 1000000+0 records in
> > 1000000+0 records out
> > 512000000 bytes (512 MB, 488 MiB) copied, 1.99179 s, 257 MB/s

Thanks for the measurements, Xu!

> Can you show the implementation of the indirect call you used?

Xu used my development branch here:
https://github.com/FlorentRevest/linux/commits/fprobe-min-args

As it stands, the performance impact of the fprobe-based
implementation would be too high for us. I wonder how much Mark's idea
here:
https://git.kernel.org/pub/scm/linux/kernel/git/mark/linux.git/log/?h=arm64/ftrace/per-callsite-ops
would help, but it doesn't work right now.
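
For a rough sense of scale, the two dd results quoted above (1.3, direct
call, and 1.4, indirect call) can be compared directly. The sketch below is
only back-of-the-envelope arithmetic on those quoted numbers. It assumes one
vfs_write() call per dd block, which the thread does not state, and the
constant and function names are made up for illustration.

```python
# Numbers copied from the two quoted dd runs (1.3 direct, 1.4 indirect).
BYTES_COPIED = 512_000_000
WRITES = 1_000_000  # assumption: one vfs_write() call per dd block

direct_s = 1.72973    # elapsed time with the direct call
indirect_s = 1.99179  # elapsed time with the indirect call


def throughput_mb_s(seconds):
    """Throughput in MB/s (10^6 bytes), the unit dd reports."""
    return BYTES_COPIED / seconds / 1e6


# Extra wall-clock time of the indirect-call run over the direct-call run.
extra_s = indirect_s - direct_s
# Relative slowdown of the whole dd run, in percent.
slowdown_pct = extra_s / direct_s * 100
# Extra time attributed to each write, under the one-call-per-block assumption.
extra_ns_per_write = extra_s / WRITES * 1e9

print(f"direct:   {throughput_mb_s(direct_s):.0f} MB/s")
print(f"indirect: {throughput_mb_s(indirect_s):.0f} MB/s")
print(f"slowdown: {slowdown_pct:.1f} %")
print(f"extra per write: {extra_ns_per_write:.0f} ns")
```

This compares a single run of each configuration, so it says nothing about
run-to-run variance.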