Re: [PATCH v8 4/7] entry: Support Syscall User Dispatch on common syscall entry

On Fri, Nov 27, 2020 at 11:33 AM Gabriel Krisman Bertazi
<krisman@xxxxxxxxxxxxx> wrote:
>
> Syscall User Dispatch (SUD) must take precedence over seccomp and
> ptrace, since the use case is emulation (it can be invoked with a
> different ABI) such that seccomp filtering by syscall number doesn't
> make sense in the first place.  In addition, either the syscall is
> dispatched back to userspace, in which case there is no resource to
> trace, or the syscall will be executed, and seccomp/ptrace will execute
> next.
>
> Since SUD runs before tracepoints, it needs to be a SYSCALL_WORK_EXIT as
> well, just to prevent a trace exit event when dispatch was triggered.
> For that, on_syscall_dispatch() examines context to skip the
> tracepoint, audit and other work.
>
> Signed-off-by: Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx>
> Acked-by: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
> ---
> Changes since v6:
>   - Update do_syscall_intercept signature (Christian Brauner)
>   - Move it to before tracepoints
>   - Use SYSCALL_WORK flags
> ---
>  include/linux/entry-common.h |  2 ++
>  kernel/entry/common.c        | 17 +++++++++++++++++
>  2 files changed, 19 insertions(+)
>
> diff --git a/include/linux/entry-common.h b/include/linux/entry-common.h
> index 49b26b216e4e..a6e98b4ba8e9 100644
> --- a/include/linux/entry-common.h
> +++ b/include/linux/entry-common.h
> @@ -44,10 +44,12 @@
>                                  SYSCALL_WORK_SYSCALL_TRACE |           \
>                                  SYSCALL_WORK_SYSCALL_EMU |             \
>                                  SYSCALL_WORK_SYSCALL_AUDIT |           \
> +                                SYSCALL_WORK_SYSCALL_USER_DISPATCH |   \
>                                  ARCH_SYSCALL_WORK_ENTER)
>  #define SYSCALL_WORK_EXIT      (SYSCALL_WORK_SYSCALL_TRACEPOINT |      \
>                                  SYSCALL_WORK_SYSCALL_TRACE |           \
>                                  SYSCALL_WORK_SYSCALL_AUDIT |           \
> +                                SYSCALL_WORK_SYSCALL_USER_DISPATCH |   \
>                                  ARCH_SYSCALL_WORK_EXIT)
>
>  /*
> diff --git a/kernel/entry/common.c b/kernel/entry/common.c
> index f1b12dc32ff4..ec20aba3b890 100644
> --- a/kernel/entry/common.c
> +++ b/kernel/entry/common.c
> @@ -6,6 +6,8 @@
>  #include <linux/livepatch.h>
>  #include <linux/audit.h>
>
> +#include "common.h"
> +
>  #define CREATE_TRACE_POINTS
>  #include <trace/events/syscalls.h>
>
> @@ -47,6 +49,16 @@ static long syscall_trace_enter(struct pt_regs *regs, long syscall,
>  {
>         long ret = 0;
>
> +       /*
> +        * Handle Syscall User Dispatch.  This must come first, since
> +        * the ABI here can be something that doesn't make sense for
> +        * other syscall_work features.
> +        */
> +       if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
> +               if (do_syscall_user_dispatch(regs))
> +                       return -1L;
> +       }
> +
>         /* Handle ptrace */
>         if (work & (SYSCALL_WORK_SYSCALL_TRACE | SYSCALL_WORK_SYSCALL_EMU)) {
>                 ret = arch_syscall_enter_tracehook(regs);
> @@ -232,6 +244,11 @@ static void syscall_exit_work(struct pt_regs *regs, unsigned long work)
>  {
>         bool step;
>
> +       if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
> +               if (on_syscall_dispatch())
> +                       return;
> +       }

I think this would be less confusing if you just open-coded the body
of on_syscall_dispatch here and got rid of the helper.
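
For illustration, open-coding could look roughly like the sketch below.
The body of on_syscall_dispatch() is not shown in this patch, so this
assumes it only tests and clears a per-task flag. The struct, the
on_dispatch field, the function name and the flag's bit value are all
hypothetical userspace stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in; the real flag's bit value is not shown in the patch. */
#define SYSCALL_WORK_SYSCALL_USER_DISPATCH (1UL << 0)

/* Hypothetical stand-in for the per-task state the helper would look at. */
struct task_sketch {
	bool on_dispatch; /* assumed: set when the syscall was dispatched to userspace */
};

/*
 * Mirrors the shape of the check added to syscall_exit_work().
 * Returns true when the remaining exit work (tracepoint, audit, ptrace)
 * should be skipped.
 */
static bool exit_work_skipped(struct task_sketch *t, unsigned long work)
{
	if (work & SYSCALL_WORK_SYSCALL_USER_DISPATCH) {
		/* Open-coded in place of the assumed on_syscall_dispatch() helper. */
		if (t->on_dispatch) {
			t->on_dispatch = false; /* consume the flag for this syscall */
			return true;
		}
	}
	return false;
}
```

With the test and the clear both visible at the call site, it is easier
to see that the early return consumes the flag, so the next syscall that
is not dispatched goes through the normal exit work.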

--Andy


