Re: [PATCH bpf-next v4 3/3] selftests/bpf: Fix dangling stdout seen by traffic monitor thread

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Mar 4, 2025 at 4:12 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote:
>
> On Tue, 2025-03-04 at 08:36 -0800, Amery Hung wrote:
> > Traffic monitor thread may see dangling stdout as the main thread closes
> > and reassigns stdout without protection. This happens when the main thread
> > finishes one subtest and moves to another one in the same netns_new()
> > scope.
> > The issue can be reproduced by running test_progs repeatedly with traffic
> > monitor enabled:
> >
> > for ((i=1;i<=100;i++)); do
> >    ./test_progs -a flow_dissector_skb* -m '*'
> > done
> >
> > Fix it by first consolidating stdout assignment into stdio_restore().
> > stdout will be restored to env.stdout_saved when a test ends or running
> > in the crash handler and to test_state.stdout_saved otherwise.
> > Then, protect use/close/reassignment of stdout with a lock. The locking
> > in the main thread is always performed regradless of whether traffic
> > monitor is running or not for simplicity. It won't have any side-effect.
> > stdio_restore() is kept in the crash handler instead of making all print
> > functions in the crash handler use env.stdout_saved to make it less
> > error-prone.
> >
> > Signed-off-by: Amery Hung <ameryhung@xxxxxxxxx>
> > ---
>
> This patch fixes the error for me.
>
> Tested-by: Eduard Zingerman <eddyz87@xxxxxxxxx>
>
> >  tools/testing/selftests/bpf/test_progs.c | 59 ++++++++++++++++--------
> >  1 file changed, 39 insertions(+), 20 deletions(-)
> >
> > diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
> > index ab0f2fed3c58..5b89f6ca5a0a 100644
> > --- a/tools/testing/selftests/bpf/test_progs.c
> > +++ b/tools/testing/selftests/bpf/test_progs.c
> > @@ -88,7 +88,11 @@ static void stdio_hijack(char **log_buf, size_t *log_cnt)
> >  #endif
> >  }
> >
> > -static void stdio_restore_cleanup(void)
> > +static pthread_mutex_t stdout_lock = PTHREAD_MUTEX_INITIALIZER;
> > +
> > +static bool in_crash_handler(void);
> > +
> > +static void stdio_restore(void)
> >  {
> >  #ifdef __GLIBC__
> >       if (verbose() && env.worker_id == -1) {
> > @@ -98,34 +102,34 @@ static void stdio_restore_cleanup(void)
> >
> >       fflush(stdout);
> >
> > -     if (env.subtest_state) {
> > +     pthread_mutex_lock(&stdout_lock);
> > +
> > +     if (!env.subtest_state || in_crash_handler()) {
> > +             if (stdout == env.stdout_saved)
> > +                     goto out;
> > +
> > +             fclose(env.test_state->stdout_saved);
> > +             env.test_state->stdout_saved = NULL;
> > +             stdout = env.stdout_saved;
> > +             stderr = env.stderr_saved;
> > +     } else {
> >               fclose(env.subtest_state->stdout_saved);
> >               env.subtest_state->stdout_saved = NULL;
> >               stdout = env.test_state->stdout_saved;
> >               stderr = env.test_state->stdout_saved;
> > -     } else {
> > -             fclose(env.test_state->stdout_saved);
> > -             env.test_state->stdout_saved = NULL;
> >       }
> > +out:
> > +     pthread_mutex_unlock(&stdout_lock);
> >  #endif
> >  }
>
> stdio_restore_cleanup() did not reset stderr/stdout when
> env.subtest_state was NULL, but this difference does not seem to
> matter, stdio_restore_cleanup() was called from:
> - test__start_subtest(), where stdio_hijack_init() would override
>   stderr/stdout anyways.
> - run_one_test(), where it is followed by call to stdio_restore().
>
> I think this change is Ok.
>
> [...]
>
> > @@ -1276,6 +1281,18 @@ void crash_handler(int signum)
> >       backtrace_symbols_fd(bt, sz, STDERR_FILENO);
> >  }
> >
> > +static bool in_crash_handler(void)
> > +{
> > +     struct sigaction sigact;
> > +
> > +     /* sa_handler will be cleared if invoked since crash_handler is
> > +      * registered with SA_RESETHAND
> > +      */
> > +     sigaction(SIGSEGV, NULL, &sigact);
> > +
> > +     return sigact.sa_handler != crash_handler;
> > +}
> > +
>
> The patch would be simpler w/o this function. I double checked
> functions called from crash_handler() and two 'fprintf(stderr, ...)'
> there are the only places where stderr/stdout is used instead of
> *_saved versions. It is already a prevalent pattern to do
> 'fprintf(env.stderr_saved, ...)' in this file.
> Or pass a flag as in v3?

While that is a common pattern, I think restoring stdio there can make
sure people don't accidentally use stdout in the crash handler and
find that it does not work occasionally.

I will remove the in_crash_handler special case handling per Martin's
suggestion.


>
> >  void hexdump(const char *prefix, const void *buf, size_t len)
> >  {
> >       for (int i = 0; i < len; i++) {
> > @@ -1957,6 +1974,8 @@ int main(int argc, char **argv)
>
> [...]
>





[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux