On Tue, Mar 5, 2024 at 9:18 AM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
>
> On Fri, Mar 01, 2024 at 11:39:03AM -0800, Kui-Feng Lee wrote:
> >
> >
> >
> > On 2/29/24 06:39, Jiri Olsa wrote:
> > > One of uprobe pain points is having slow execution that involves
> > > two traps in worst case scenario or single trap if the original
> > > instruction can be emulated. For return uprobes there's one extra
> > > trap on top of that.
> > >
> > > My current idea on how to make this faster is to follow the optimized
> > > kprobes and replace the normal uprobe trap instruction with jump to
> > > user space trampoline that:
> > >
> > >   - executes syscall to call uprobe consumers callbacks
> > >   - executes original instructions
> > >   - jumps back to continue with the original code
> > >
> > > There are of course corner cases where above will have trouble or
> > > won't work completely, like:
> > >
> > >   - executing original instructions in the trampoline is tricky wrt
> > >     rip relative addressing
> > >
> > >   - some instructions we can't move to trampoline at all
> > >
> > >   - the uprobe address is on page boundary so the jump instruction to
> > >     trampoline would span across 2 pages, hence the page replace won't
> > >     be atomic, which might cause issues
> > >
> > >   - ... ? many others I'm sure
> > >
> > > Still with all the limitations I think we could be able to speed up
> > > some amount of the uprobes, which seems worth doing.
> >
> > Just a random idea related to this.
> > Could we also run jit code of bpf programs in the user space to collect
> > information instead of going back to the kernel every time?

I was thinking about a similar idea. I guess these user space BPF
programs will have limited features; we can probably use them to
update bpf maps. Even for this limited scope, we still need bpf_arena.
Otherwise, the user space bpf program will need to update the bpf maps
with sys_bpf(), which adds the same overhead as triggering the program
with a syscall.
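
To make the overhead point concrete, here is a minimal sketch in plain C
(not BPF). It assumes Linux, and uses a temporary file mapped MAP_SHARED
as a stand-in for an arena-like shared region. struct shared_counter and
the counter_* helpers are hypothetical names for illustration only; none
of this is the bpf_arena or sys_bpf() API. One path pays a syscall per
update (pwrite() standing in for sys_bpf()), the other is a plain store
through the mapping.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for one map slot; not a real BPF map or arena. */
struct shared_counter {
	int fd;                  /* backing file, used by the syscall path */
	volatile uint64_t *slot; /* MAP_SHARED view of the same page */
	size_t len;              /* length of the mapping */
};

/* Create a one-page temporary file and map it shared. Returns 0 on success. */
int counter_open(struct shared_counter *c)
{
	FILE *f = tmpfile();
	long page = sysconf(_SC_PAGESIZE);
	void *p;

	if (!f || page <= 0)
		return -1;
	c->fd = dup(fileno(f)); /* keep the unlinked file alive after fclose() */
	fclose(f);
	if (c->fd < 0)
		return -1;
	if (ftruncate(c->fd, page) != 0) {
		close(c->fd);
		return -1;
	}
	p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, c->fd, 0);
	if (p == MAP_FAILED) {
		close(c->fd);
		return -1;
	}
	c->slot = p;
	c->len = (size_t)page;
	return 0;
}

/* Syscall path: every update enters the kernel once. */
int counter_set_syscall(struct shared_counter *c, uint64_t v)
{
	return pwrite(c->fd, &v, sizeof(v), 0) == (ssize_t)sizeof(v) ? 0 : -1;
}

/* Direct path: a plain store into shared memory, no kernel entry. */
void counter_set_direct(struct shared_counter *c, uint64_t v)
{
	*c->slot = v;
}

/* Read the current value through the shared mapping. */
uint64_t counter_get(const struct shared_counter *c)
{
	return *c->slot;
}

void counter_close(struct shared_counter *c)
{
	munmap((void *)c->slot, c->len);
	close(c->fd);
}
```

Both paths end up in the same backing page (this relies on Linux keeping
file writes and shared mappings coherent), but only the direct store
avoids the per-update kernel entry, which is the part the arena would
buy us.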

>
> sorry for late reply, do you mean like ubpf? the scope of this change
> is to speed up the generic uprobe, ebpf is just one of the consumers

I guess this means we need a new syscall?

Thanks,
Song