Re: [PATCH bpf-next v4 3/3] selftests/bpf: Fix dangling stdout seen by traffic monitor thread

Eduard Zingerman <eddyz87@xxxxxxxxx> · Tue, 04 Mar 2025 16:12:20 -0800

On Tue, 2025-03-04 at 08:36 -0800, Amery Hung wrote:
> Traffic monitor thread may see dangling stdout as the main thread closes
> and reassigns stdout without protection. This happens when the main thread
> finishes one subtest and moves to another one in the same netns_new()
> scope.
> The issue can be reproduced by running test_progs repeatedly with traffic
> monitor enabled:
> 
> for ((i=1;i<=100;i++)); do
>    ./test_progs -a flow_dissector_skb* -m '*'
> done
> 
> Fix it by first consolidating stdout assignment into stdio_restore().
> stdout will be restored to env.stdout_saved when a test ends or running
> in the crash handler and to test_state.stdout_saved otherwise.
> Then, protect use/close/reassignment of stdout with a lock. The locking
> in the main thread is always performed regradless of whether traffic
> monitor is running or not for simplicity. It won't have any side-effect.
> stdio_restore() is kept in the crash handler instead of making all print
> functions in the crash handler use env.stdout_saved to make it less
> error-prone.
> 
> Signed-off-by: Amery Hung <ameryhung@xxxxxxxxx>
> ---

This patch fixes the error for me.

Tested-by: Eduard Zingerman <eddyz87@xxxxxxxxx>

>  tools/testing/selftests/bpf/test_progs.c | 59 ++++++++++++++++--------
>  1 file changed, 39 insertions(+), 20 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
> index ab0f2fed3c58..5b89f6ca5a0a 100644
> --- a/tools/testing/selftests/bpf/test_progs.c
> +++ b/tools/testing/selftests/bpf/test_progs.c
> @@ -88,7 +88,11 @@ static void stdio_hijack(char **log_buf, size_t *log_cnt)
>  #endif
>  }
>  
> -static void stdio_restore_cleanup(void)
> +static pthread_mutex_t stdout_lock = PTHREAD_MUTEX_INITIALIZER;
> +
> +static bool in_crash_handler(void);
> +
> +static void stdio_restore(void)
>  {
>  #ifdef __GLIBC__
>  	if (verbose() && env.worker_id == -1) {
> @@ -98,34 +102,34 @@ static void stdio_restore_cleanup(void)
>  
>  	fflush(stdout);
>  
> -	if (env.subtest_state) {
> +	pthread_mutex_lock(&stdout_lock);
> +
> +	if (!env.subtest_state || in_crash_handler()) {
> +		if (stdout == env.stdout_saved)
> +			goto out;
> +
> +		fclose(env.test_state->stdout_saved);
> +		env.test_state->stdout_saved = NULL;
> +		stdout = env.stdout_saved;
> +		stderr = env.stderr_saved;
> +	} else {
>  		fclose(env.subtest_state->stdout_saved);
>  		env.subtest_state->stdout_saved = NULL;
>  		stdout = env.test_state->stdout_saved;
>  		stderr = env.test_state->stdout_saved;
> -	} else {
> -		fclose(env.test_state->stdout_saved);
> -		env.test_state->stdout_saved = NULL;
>  	}
> +out:
> +	pthread_mutex_unlock(&stdout_lock);
>  #endif
>  }

stdio_restore_cleanup() did not reset stderr/stdout when
env.subtest_state was NULL, but this difference does not seem to
matter, stdio_restore_cleanup() was called from:
- test__start_subtest(), where stdio_hijack_init() would override
  stderr/stdout anyways.
- run_one_test(), where it is followed by call to stdio_restore().

I think this change is Ok.

[...]

> @@ -1276,6 +1281,18 @@ void crash_handler(int signum)
>  	backtrace_symbols_fd(bt, sz, STDERR_FILENO);
>  }
>  
> +static bool in_crash_handler(void)
> +{
> +	struct sigaction sigact;
> +
> +	/* sa_handler will be cleared if invoked since crash_handler is
> +	 * registered with SA_RESETHAND
> +	 */
> +	sigaction(SIGSEGV, NULL, &sigact);
> +
> +	return sigact.sa_handler != crash_handler;
> +}
> +

The patch would be simpler w/o this function. I double checked
functions called from crash_handler() and two 'fprintf(stderr, ...)'
there are the only places where stderr/stdout is used instead of
*_saved versions. It is already a prevalent pattern to do
'fprintf(env.stderr_saved, ...)' in this file.
Or pass a flag as in v3?

>  void hexdump(const char *prefix, const void *buf, size_t len)
>  {
>  	for (int i = 0; i < len; i++) {
> @@ -1957,6 +1974,8 @@ int main(int argc, char **argv)

[...]