On Mon, Oct 21 2024 at 09:46, Björn Töpel wrote:
> Celeste Liu <coelacanthushex@xxxxxxxxx> writes:
>> 1. syscall_enter_from_user_mode() will do two things:
>>    1) the return value is only to inform whether the syscall should be skipped.
>>    2) regs will be modified by filters (seccomp or ptrace and so on).
>> 2. for common entry user, there is two informations: syscall number and
>>    the return value of syscall_enter_from_user_mode() (called is_skipped below).
>>    so there is three situations:
>>    1) if syscall number is invalid, the syscall should not be performed, and
>>       we set a0 to -ENOSYS to inform userspace the syscall doesn't exist.
>>    2) if syscall number is valid, is_skipped will be used:
>>       a) if is_skipped is -1, which means there are some filters reject this syscall,
>>          so the syscall should not performed. (Of course, we can use bool instead to
>>          get better semantic)
>>       b) if is_skipped != -1, which means the filters approved this syscall,
>>          so we invoke syscall handler with modified regs.
>>
>> In your design, the logical condition is not obvious. Why syscall_enter_from_user_mode()
>> informed the syscall will be skipped but the syscall handler will be called
>> when syscall number is invalid? The users need to think two things to get result:
>>    a) -1 means skip
>>    b) -1 < 0 in signed integer, so the skip condition is always a invalid syscall number.
>>
>> In may way, the users only need to think one thing: The syscall_enter_from_user_mode()
>> said -1 means the syscall should not be performed, so use it as a condition of reject
>> directly. They just need to combine the informations that they get from API as the
>> condition of control flow.
>
> I'm all-in for simpler API usage! Maybe massage the
> syscall_enter_from_user_mode() (or a new one), so that additional
> syscall_get_nr() call is not needed?

It's completely unclear to me what the actual problem is.
The flow how this works on all architectures is:

       regs->orig_a0 = regs->a0
       regs->a0 = -ENOSYS;

       nr = syscall_enter_from_user_mode(....);

       if (nr >= 0)
          regs->a0 = nr < MAX_SYSCALL ? syscall(nr) : -ENOSYS;

If syscall_trace_enter() returns -1 to skip the syscall, then regs->a0 is
unmodified, unless one of the magic operations modified it.

If syscall_trace_enter() was not active (no tracer, no seccomp ...) then
regs->a0 already contains -ENOSYS.

So what's the exact problem?

Thanks,

        tglx
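
For illustration, the flow described in this message can be modeled in
user space. This is only a minimal sketch under stated assumptions, not
kernel code: every model_* name, the table size MODEL_MAX_SYSCALL and the
"100 + nr" handler are hypothetical stand-ins, and the filter behaviour is
reduced to three cases (inactive, skip, skip after setting a0).

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel API. */
#define MODEL_MAX_SYSCALL 4

struct model_regs {
	long orig_a0;	/* first syscall argument, saved on entry */
	long a0;	/* first argument on entry, return value on exit */
};

/* What the entry hook (tracer, seccomp, ...) decides in this model. */
enum model_filter {
	MODEL_FILTER_NONE,		/* not active: pass nr through */
	MODEL_FILTER_SKIP,		/* skip, leave a0 untouched */
	MODEL_FILTER_SKIP_SET_A0,	/* skip and set a0 itself */
};

/* Stand-in for syscall_enter_from_user_mode(): returns nr, or -1 to skip. */
static long model_enter(struct model_regs *regs, long nr,
			enum model_filter filter, long forced_a0)
{
	switch (filter) {
	case MODEL_FILTER_SKIP:
		return -1;
	case MODEL_FILTER_SKIP_SET_A0:
		regs->a0 = forced_a0;
		return -1;
	default:
		return nr;
	}
}

/* Stand-in for the syscall table: every valid nr returns 100 + nr. */
static long model_syscall(long nr)
{
	return 100 + nr;
}

/* The flow from the message: preset -ENOSYS, enter, dispatch if nr >= 0. */
static long model_handle(struct model_regs *regs, long nr,
			 enum model_filter filter, long forced_a0)
{
	assert(regs != NULL);

	regs->orig_a0 = regs->a0;
	regs->a0 = -ENOSYS;

	nr = model_enter(regs, nr, filter, forced_a0);

	if (nr >= 0)
		regs->a0 = nr < MODEL_MAX_SYSCALL ? model_syscall(nr) : -ENOSYS;

	return regs->a0;
}
```

Calling model_handle() with MODEL_FILTER_SKIP leaves a0 at the preset
-ENOSYS, while MODEL_FILTER_SKIP_SET_A0 leaves whatever value the filter
wrote. In neither case is a separate check of the syscall number needed,
because the skip value -1 already fails the nr >= 0 test.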