On Thu, Nov 14, 2024 at 03:44:12PM -0800, Andrii Nakryiko wrote:
> On Tue, Nov 5, 2024 at 8:33 AM Jiri Olsa <olsajiri@xxxxxxxxx> wrote:
> >
> > On Tue, Nov 05, 2024 at 03:23:27PM +0100, Peter Zijlstra wrote:
> > > On Tue, Nov 05, 2024 at 02:33:59PM +0100, Jiri Olsa wrote:
> > > > Adding interface to add special mapping for user space page that will be
> > > > used as place holder for uprobe trampoline in following changes.
> > > >
> > > > The get_tramp_area(vaddr) function either finds 'callable' page or creates
> > > > new one. The 'callable' means it's reachable by call instruction (from
> > > > vaddr argument) and is decided by each arch via new arch_uprobe_is_callable
> > > > function.
> > > >
> > > > The put_tramp_area function either drops refcount or destroys the special
> > > > mapping and all the maps are cleaned up when the process goes down.
> > >
> > > In another thread somewhere, Andrii mentioned that Meta has executables
> > > with more than 4G of .text. This isn't going to work for them, is it?
> > >
> >
> > not if you can't reach the trampoline from the probed address
>
> That specific example was about 1.5GB (though we might have bigger
> .text, I didn't do exhaustive research). As Jiri said, this would be
> best effort trying to find closest free mapping to stay within +/-2GB
> offset. If that fails, we always would be falling back to slower
> int3-based uprobing, yep.
>
> Jiri, we could also have an option to support 64-bit call, right? We'd
> need nop9 for that, but it's an option as well to future-proofing this
> approach, no?

hm, I don't think there's a call with a relative 64-bit offset

there's an indirect call through a register or an address.. but I think
we would fit in nop10 with the indirect call through an address

>
> Also, can we somehow use fs/gs-based indirect calls/jumps somehow to
> have a guarantee that offset is always small (<2GB away relative to
> the base stored in fs/gs). Not sure if this is feasible, but I thought
> it would be good to bring this up just to make sure it doesn't work.
>
> If segment based absolute call is somehow feasible, we can probably
> simplify a bunch of stuff by allocating it eagerly, once, and
> somewhere high up next to VDSO (or maybe even put it into VDSO, don't
> know).

yes, that would be convenient

jirka

>
> Anyways, let's brainstorm if there are any clever alternatives here.
>
> >
> > jirka
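
For reference, the +/-2GB limit discussed above follows from the x86-64
direct near call (opcode 0xe8), which takes a signed 32-bit displacement
relative to the address of the next instruction. Below is a minimal
sketch of such a reachability check. It only illustrates the constraint;
call_rel32_reaches and CALL_INSN_SIZE are hypothetical names, and this is
not the arch_uprobe_is_callable implementation from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Size of "call rel32": one opcode byte (0xe8) plus a 4-byte displacement. */
#define CALL_INSN_SIZE 5

/*
 * Hypothetical helper: returns true if a call rel32 placed at vaddr
 * can reach tramp, i.e. the displacement fits in a signed 32-bit value.
 */
static bool call_rel32_reaches(uint64_t vaddr, uint64_t tramp)
{
	/* The displacement is measured from the end of the call instruction. */
	int64_t disp = (int64_t)(tramp - (vaddr + CALL_INSN_SIZE));

	return disp >= INT32_MIN && disp <= INT32_MAX;
}
```

With a check like this, a trampoline 4GB away from the probed address is
rejected, which is the case where the fallback to int3-based uprobes
mentioned above would apply.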