Re: [PATCH v1 1/1] Firstly, as Andy mentioned, this should be smp_rmb() instead of rmb(). considering that TSYNC is a cross-thread situation, and rmb() is a mandatory barrier which should not be used to control SMP effects, since mandatory barriers impose unnecessary overhead on both SMP and UP systems, as kernel Documentation said.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Feb 02, 2021 at 06:13:07PM +0800, wanghongzhe wrote:
> Secondly, the smp_rmb() should be put between reading SYSCALL_WORK_SECCOMP and reading
> seccomp.mode, not between reading seccomp.mode and seccomp->filter, to make
> sure that any changes to mode from another thread have been seen after
> SYSCALL_WORK_SECCOMP was seen, as the original comment shown. This issue seems to be
> misintroduced at 13aa72f0fd0a9f98a41cefb662487269e2f1ad65 which aims to
> refactor the filter callback and the API. So the intuitive solution is to put
> it back like:
> 
> Thirdly, however, we can go further to improve the performace of checking
> syscall, considering that smp_rmb is always executed on the syscall-check
> path at each time for both FILTER and STRICT check while the TSYNC case
> which may lead to race condition is just a rare situation, and that in
> some arch like Arm64 smp_rmb is dsb(ishld) not a cheap barrier() in x86-64.
> 
> As a result, smp_rmb() should only be executed when necessary, e.g, it is
> only necessary when current thread's mode is SECCOMP_MODE_DISABLED at the
> first TYSNCed time, because after that the current thread's mode will always
> be SECCOMP_MODE_FILTER (and SYSCALL_WORK_SECCOMP will always be set) and can not be
> changed anymore by anyone. In other words, after that, any thread can not
> change the mode (and SYSCALL_WORK_SECCOMP), so the race condition disappeared, and
> no more smb_rmb() needed ever.
> 
> So the solution is to read mode again behind smp_rmb() after the mode is seen
> as SECCOMP_MODE_DISABLED by current thread at the first TSYNCed time, and if
> the new mode don't equals to SECCOMP_MODE_FILTER, do BUG(), go to FILTER path
> otherwise.
> 
> RFC -> v1:
>  - replace rmb() with smp_rmb()
>  - move the smp_rmb() logic to the middle between SYSCALL_WORK_SECCOMP and mode
> 
> Signed-off-by: wanghongzhe <wanghongzhe@xxxxxxxxxx>
> Reviewed-by: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
> ---
>  kernel/seccomp.c | 25 +++++++++++++++++--------
>  1 file changed, 17 insertions(+), 8 deletions(-)
> 
> diff --git a/kernel/seccomp.c b/kernel/seccomp.c
> index 952dc1c90229..a621fb913ec6 100644
> --- a/kernel/seccomp.c
> +++ b/kernel/seccomp.c
> @@ -1160,12 +1160,6 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>  	int data;
>  	struct seccomp_data sd_local;
>  
> -	/*
> -	 * Make sure that any changes to mode from another thread have
> -	 * been seen after SYSCALL_WORK_SECCOMP was seen.
> -	 */
> -	rmb();
> -
>  	if (!sd) {
>  		populate_seccomp_data(&sd_local);
>  		sd = &sd_local;
> @@ -1289,7 +1283,6 @@ static int __seccomp_filter(int this_syscall, const struct seccomp_data *sd,
>  
>  int __secure_computing(const struct seccomp_data *sd)
>  {
> -	int mode = current->seccomp.mode;
>  	int this_syscall;
>  
>  	if (IS_ENABLED(CONFIG_CHECKPOINT_RESTORE) &&
> @@ -1299,10 +1292,26 @@ int __secure_computing(const struct seccomp_data *sd)
>  	this_syscall = sd ? sd->nr :
>  		syscall_get_nr(current, current_pt_regs());
>  
> -	switch (mode) {
> +	/*
> +	 * Make sure that any changes to mode from another thread have
> +	 * been seen after SYSCALL_WORK_SECCOMP was seen.
> +	 */
> +	smp_rmb();

Let's start with a patch that just replaces rmb() with smp_rmb() and
then work on optimizing. Can you provide performance numbers that show
rmb() (and soon smp_rmb()) is causing actual problems here?

> +
> +	switch (current->seccomp.mode) {
>  	case SECCOMP_MODE_STRICT:
>  		__secure_computing_strict(this_syscall);  /* may call do_exit */
>  		return 0;
> +	/*
> +	 * Make sure that change to mode (from SECCOMP_MODE_DISABLED to
> +	 * SECCOMP_MODE_FILTER) from another thread using TSYNC ability
> +	 * have been seen after SYSCALL_WORK_SECCOMP was seen. Read mode again behind
> +	 * smp_rmb(), if it equals SECCOMP_MODE_FILTER, go to the right path.
> +	 */
> +	case SECCOMP_MODE_DISABLED:
> +		smp_rmb();
> +		if (unlikely(current->seccomp.mode != SECCOMP_MODE_FILTER))
> +			BUG();

BUG() should never be used[1]. This is a recoverable situation, I think, and
should be handled as such.

-Kees

[1] https://www.kernel.org/doc/html/latest/process/deprecated.html#bug-and-bug-on

>  	case SECCOMP_MODE_FILTER:
>  		return __seccomp_filter(this_syscall, sd, false);
>  	default:
> -- 
> 2.19.1
> 

-- 
Kees Cook



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux