On Thu, 4 Apr 2024 13:58:43 +0200 Jiri Olsa <olsajiri@xxxxxxxxx> wrote: > On Wed, Apr 03, 2024 at 07:00:07PM -0700, Andrii Nakryiko wrote: > > SNIP > > > Check rt_sigreturn syscall (manpage at [0], for example). > > > > sigreturn() exists only to allow the implementation of signal > > handlers. It should never be called directly. (Indeed, a simple > > sigreturn() wrapper in the GNU C library simply returns -1, with > > errno set to ENOSYS.) Details of the arguments (if any) passed > > to sigreturn() vary depending on the architecture. (On some > > architectures, such as x86-64, sigreturn() takes no arguments, > > since all of the information that it requires is available in the > > stack frame that was previously created by the kernel on the > > user-space stack.) > > > > This is a very similar use case. Also, check its source code in > > arch/x86/kernel/signal_64.c. It sends SIGSEGV to the calling process > > on any sign of something not being right. It's exactly the same with > > sys_uretprobe. > > > > [0] https://man7.org/linux/man-pages/man2/sigreturn.2.html > > > > > And the number of syscalls are limited resource. > > > > We have almost 500 of them, it didn't seems like adding 1-2 for good > > reasons would be a problem. Can you please point to where the limits > > on syscalls as a resource are described? I'm curious to learn. > > > > > > > > I'm actually not sure how much we need to care of it, but adding a new > > > syscall is worth to be discussed carefully because all of them are > > > user-space compatibility. > > > > Absolutely, it's a good discussion to have. > > > > > > > > > > > > Also, we should run syzkaller on this syscall. And if uretprobe is > > > > > > > > > > > > right, I'll check on syzkaller > > > > > > > > > > > > > set in the user function, what happen if the user function directly > > > > > > > calls this syscall? (maybe it consumes shadow stack?) > > > > > > > > > > > > the process should receive SIGILL if there's no pending uretprobe for > > > > > > the current task, or it will trigger uretprobe if there's one pending > > > > > > > > > > No, that is too aggressive and not safe. Since the syscall is exposed to > > > > > user program, it should return appropriate error code instead of SIGILL. > > > > > > > > > > > > > This is the way it is today with uretprobes even through interrupt. > > > > > > I doubt that the interrupt (exception) and syscall should be handled > > > differently. Especially, this exception is injected by uprobes but > > > syscall will be caused by itself. But syscall can be called from user > > > program (of couse this works as sys_kill(self, SIGILL)). > > > > Yep, I'd keep the behavior the same between uretprobes implemented > > through int3 and sys_uretprobe. > > +1 > > > > > > > > > > E.g., it could happen that user process is using fibers and is > > > > replacing stack pointer without kernel realizing this, which will > > > > trigger some defensive checks in uretprobe handling code and kernel > > > > will send SIGILL because it can't support such cases. This is > > > > happening today already, and it works fine in practice (except for > > > > applications that manually change stack pointer, too bad, you can't > > > > trace them with uretprobes, unfortunately). > > > > > > OK, we at least need to document it. > > > > +1, yep > > > > > > > > > > > > > So I think it's absolutely adequate to have this behavior if the user > > > > process is *intentionally* abusing this API. > > > > > > Of course user expected that it is abusing. So at least we need to > > > add a document that this syscall number is reserved to uprobes and > > > user program must not use it. > > > > > > > Totally agree about documenting this. > > ok there's map page on sigreturn.. do you think we should add man page > for uretprobe or you can think of some other place to document it? I think it is better to have a man-page. Anyway, to discuss and explain this syscall, the man-page is a good format to describe it. > > > > > > > > > > > > > > > > > > > but we could limit the syscall to be executed just from the trampoline, > > > > > > that should prevent all the user space use cases, I'll do that in next > > > > > > version and add more tests for that > > > > > > > > > > Why not limit? :) The uprobe_handle_trampoline() expects it is called > > > > > only from the trampoline, so it is natural to check the caller address. > > > > > (and uprobe should know where is the trampoline) > > > > > > > > > > Since the syscall is always exposed to the user program, it should > > > > > - Do nothing and return an error unless it is properly called. > > > > > - check the prerequisites for operation strictly. > > > > > I concern that new system calls introduce vulnerabilities. > > > > > > > > > > > > > As Oleg and Jiri mentioned, this syscall can't harm kernel or other > > > > processes, only the process that is abusing the API. So any extra > > > > checks that would slow down this approach is an unnecessary overhead > > > > and complication that will never be useful in practice. > > > > > > I think at least it should check the caller address to ensure the > > > address is in the trampoline. > > > But anyway, uprobes itself can break the target process, so no one > > > might care if this system call breaks the process now. > > > > If we already have an expected range of addresses, then I think it's > > fine to do a quick unlikely() check. I'd be more concerned if we need > > to do another lookup or search to just validate this. I'm sure Jiri > > will figure it out. > > Oleg mentioned the trampoline address check could race with another > thread's mremap call, however trap is using that check as well, it > still seems like good idea to do it also in the uretprobe syscall Yeah, and also, can we add a stack pointer check if the trampoline is shared with other probe points? Thank you, > > jirka -- Masami Hiramatsu (Google) <mhiramat@xxxxxxxxxx>