On Fri, Mar 01, 2024 at 11:39:03AM -0800, Kui-Feng Lee wrote:
>
>
> On 2/29/24 06:39, Jiri Olsa wrote:
> > One of uprobe pain points is having slow execution that involves
> > two traps in worst case scenario or single trap if the original
> > instruction can be emulated. For return uprobes there's one extra
> > trap on top of that.
> >
> > My current idea on how to make this faster is to follow the optimized
> > kprobes and replace the normal uprobe trap instruction with jump to
> > user space trampoline that:
> >
> >   - executes syscall to call uprobe consumers callbacks
> >   - executes original instructions
> >   - jumps back to continue with the original code
> >
> > There are of course corner cases where above will have trouble or
> > won't work completely, like:
> >
> >   - executing original instructions in the trampoline is tricky wrt
> >     rip relative addressing
> >
> >   - some instructions we can't move to trampoline at all
> >
> >   - the uprobe address is on page boundary so the jump instruction to
> >     trampoline would span across 2 pages, hence the page replace won't
> >     be atomic, which might cause issues
> >
> >   - ... ? many others I'm sure
> >
> > Still with all the limitations I think we could be able to speed up
> > some amount of the uprobes, which seems worth doing.
>
> Just a random idea related to this.
> Could we also run jit code of bpf programs in the user space to collect
> information instead of going back to the kernel every time?

sorry for late reply, do you mean like ubpf? the scope of this change
is to speed up the generic uprobe, ebpf is just one of the consumers

jirka

> These jit code should not be able to access helpers or kfuncs, but they
> still can collect and aggregate data, store data in bpf maps, and change
> behavior of user space programs.
>
> >
> > I'd like to have the discussion on the topic and get some agreement
> > or directions on how this should be done.
> >
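
For the page boundary case in the quoted list, a minimal sketch of what
such a check could look like. It assumes 4 KiB pages and a 5-byte jump
instruction, neither of which is stated in the thread, and all names are
hypothetical, not existing kernel symbols:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumptions for illustration only: 4 KiB pages, 5-byte jump. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_JMP_LEN   5UL

/*
 * Return true if a jump of SKETCH_JMP_LEN bytes written at addr would
 * straddle two pages, i.e. replacing it would touch more than one page.
 */
bool sketch_jmp_crosses_page(uint64_t addr)
{
	uint64_t offset = addr & (SKETCH_PAGE_SIZE - 1);

	return offset + SKETCH_JMP_LEN > SKETCH_PAGE_SIZE;
}

/* Boundary cases: the last address that still fits, and the first that doesn't. */
void sketch_selftest(void)
{
	assert(!sketch_jmp_crosses_page(0x1000)); /* start of a page */
	assert(!sketch_jmp_crosses_page(0x1ffb)); /* ends exactly at page end */
	assert(sketch_jmp_crosses_page(0x1ffc));  /* last byte spills over */
	assert(sketch_jmp_crosses_page(0x1fff));  /* only one byte fits */
}
```

A probe address failing this check would presumably stay on the regular
trap-based path instead of being converted to a jump.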