>> On Feb 1, 2021, at 4:06 AM, wanghongzhe <wanghongzhe@xxxxxxxxxx> wrote: >> >> If a thread(A)'s TSYNC flag is set from seccomp(), then it will >> synchronize its seccomp filter to other threads(B) in same thread >> group. To avoid race condition, seccomp puts rmb() between reading the >> mode and filter in seccomp check patch(in B thread). >> As a result, every syscall's seccomp check is slowed down by the >> memory barrier. >> >> However, we can optimize it by calling rmb() only when filter is NULL >> and reading it again after the barrier, which means the rmb() is >> called only once in thread lifetime. >> >> The 'filter is NULL' conditon means that it is the first time >> attaching filter and is by other thread(A) using TSYNC flag. >> In this case, thread B may read the filter first and mode later in CPU >> out-of-order exection. After this time, the thread B's mode is always >> be set, and there will no race condition with the filter/bitmap. >> >> In addtion, we should puts a write memory barrier between writing the >> filter and mode in smp_mb__before_atomic(), to avoid the race >> condition in TSYNC case. > > I haven’t fully worked this out, but rmb() is bogus. This should be smp_rmb(). Yes, I think you are right.I will fix it and send another patch. >> >> Signed-off-by: wanghongzhe <wanghongzhe@xxxxxxxxxx> >> --- >> kernel/seccomp.c | 31 ++++++++++++++++++++++--------- >> 1 file changed, 22 insertions(+), 9 deletions(-) >> >> diff --git a/kernel/seccomp.c b/kernel/seccomp.c index >> 952dc1c90229..b944cb2b6b94 100644 >> --- a/kernel/seccomp.c >> +++ b/kernel/seccomp.c >> @@ -397,8 +397,20 @@ static u32 seccomp_run_filters(const struct seccomp_data *sd, >> READ_ONCE(current->seccomp.filter); >> >> /* Ensure unexpected behavior doesn't result in failing open. */ >> - if (WARN_ON(f == NULL)) >> - return SECCOMP_RET_KILL_PROCESS; >> + if (WARN_ON(f == NULL)) { >> + /* >> + * Make sure the first filter addtion (from another >> + * thread using TSYNC flag) are seen. >> + */ >> + rmb(); >> + >> + /* Read again */ >> + f = READ_ONCE(current->seccomp.filter); >> + >> + /* Ensure unexpected behavior doesn't result in failing open. */ >> + if (WARN_ON(f == NULL)) >> + return SECCOMP_RET_KILL_PROCESS; >> + } >> >> if (seccomp_cache_check_allow(f, sd)) >> return SECCOMP_RET_ALLOW; >> @@ -614,9 +626,16 @@ static inline void seccomp_sync_threads(unsigned long flags) >> * equivalent (see ptrace_may_access), it is safe to >> * allow one thread to transition the other. >> */ >> - if (thread->seccomp.mode == SECCOMP_MODE_DISABLED) >> + if (thread->seccomp.mode == SECCOMP_MODE_DISABLED) { >> + /* >> + * Make sure mode cannot be set before the filter >> + * are set. >> + */ >> + smp_mb__before_atomic(); >> + >> seccomp_assign_mode(thread, SECCOMP_MODE_FILTER, >> flags); >> + } >> } >> } >> >> @@ -1160,12 +1179,6 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd, >> int data; >> struct seccomp_data sd_local; >> >> - /* >> - * Make sure that any changes to mode from another thread have >> - * been seen after SYSCALL_WORK_SECCOMP was seen. >> - */ >> - rmb(); >> - >> if (!sd) { >> populate_seccomp_data(&sd_local); >> sd = &sd_local; >> -- >> 2.19.1 >>