Re: [PATCH v1 bpf] selftests/bpf: Add exponential backoff to map_update_retriable in test_maps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Aug 16, 2021 at 4:28 PM Song Liu <song@xxxxxxxxxx> wrote:
>
> On Mon, Aug 16, 2021 at 10:54 AM Yucong Sun <fallentree@xxxxxx> wrote:
> >
> > Using a fixed delay of 1ms is proven flaky in slow CPU environment, eg.  github
> > action CI system. This patch adds exponential backoff with a cap of 50ms, to
> > reduce the flakiness of the test.
>
> Do we have data showing how flaky the test is before and after this change?

Before the change, on 2 CPU KVM on my laptop the test is perfectly
fine, on Github action (2 emulated CPU) , it appeared to fail roughly
1 in 10 runs or even more frequently.
After the change, it appears pretty robust both on my laptop and on
github action, I ran the github action a couple times and it succeeded
every time.

>
> >
> > Signed-off-by: Yucong Sun <fallentree@xxxxxx>
> > ---
> >  tools/testing/selftests/bpf/test_maps.c | 7 ++++++-
> >  1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/tools/testing/selftests/bpf/test_maps.c b/tools/testing/selftests/bpf/test_maps.c
> > index 14cea869235b..ed92d56c19cf 100644
> > --- a/tools/testing/selftests/bpf/test_maps.c
> > +++ b/tools/testing/selftests/bpf/test_maps.c
> > @@ -1400,11 +1400,16 @@ static void test_map_stress(void)
> >  static int map_update_retriable(int map_fd, const void *key, const void *value,
> >                                 int flags, int attempts)
> >  {
> > +       int delay = 1;
> > +
> >         while (bpf_map_update_elem(map_fd, key, value, flags)) {
> >                 if (!attempts || (errno != EAGAIN && errno != EBUSY))
> >                         return -errno;
> >
> > -               usleep(1);
> > +               if (delay < 50)
> > +                       delay *= 2;
> > +
> > +               usleep(delay);
>
> It is a little weird that the delay times in microseconds are 2, 4, 8,
> 16, 32, 64, 64, ...
> Maybe just use rand()?

map_update_retriable is called by test_map_update() , which is being
parallel executed in 1024 threads, so the lock contention is
intentional, I think if we introduce randomness in the delay it kind
of defeats the purpose of the test.
My original proposal is to just increase the attempts to 10X , Andrii
proposed to use an exponential back-off, which is what I ended up
implementing.

>
> Thanks,
> Song
>
> >                 attempts--;
> >         }
> >
> > --
> > 2.30.2
> >



[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux