On Thu, Nov 7, 2024 at 9:08 PM Eduard Zingerman <eddyz87@xxxxxxxxx> wrote: > > On Fri, 2024-11-01 at 17:32 -0700, Eduard Zingerman wrote: > > On Fri, 2024-11-01 at 17:29 -0700, Alexei Starovoitov wrote: > > > > [...] > > > > > Hmm. > > > Puranjay touched it last with extra logic. > > > > > > And before that David Vernet tried to address flakiness > > > in commit 4a54de65964d. > > > Yonghong also noticed lockups in paravirt > > > and added workaround 7015843afc. > > > > > > Your additional timeout/workaround makes sense to me, > > > but would be good to bisect whether Puranjay's change caused it. > > > > I'll debug what's going on some time later today or on Sat. > > I finally had time to investigate this a bit. > First, here is how to trigger lockup: > > t1=send_signal/send_signal_perf_thread_remote; \ > t2=send_signal/send_signal_nmi_thread_remote; \ > for i in $(seq 1 100); do ./test_progs -t $t1,$t2; done > > Must be both tests for whatever reason. > The failing test is 'send_signal_nmi_thread_remote'. > > The test is organized as parent and child processes communicating > various events to each other. The intended sequence of events: > - child: > - install SIGUSR1 handler > - notify parent > - wait for parent > - parent: > - open PERF_COUNT_SW_CPU_CLOCK event > - attach BPF program to the event > - notify child > - enter busy loop for 10^8 iterations > - wait for child > - BPF program: > - send SIGUSR1 to child > - child: > - poll for SIGUSR1 in a busy loop > - notify parent > - parent: > - check value communicated by child, > terminate test. > > The lockup happens because on every other test run perf event is not > triggered, child does not receive SIGUSR1 and thus both parent and > child are stuck. > > For 'send_signal_nmi_thread_remote' perf event is defined as: > > struct perf_event_attr attr = { > .sample_period = 1, > .type = PERF_TYPE_HARDWARE, > .config = PERF_COUNT_HW_CPU_CYCLES, > }; > > And is opened for parent process pid. > > Apparently, the perf event is not always triggered between lines > send_signal.c:165-180. And at line 180 parent enters system call, > so cpu cycles stop ticking for 'parent', thus if perf event > had not been triggered already it won't be triggered at all > (as far as I understand). > > Applying same fix as Yonghong did in 7015843afc is sufficient to > reliably trigger perf event: > > --- a/tools/testing/selftests/bpf/prog_tests/send_signal.c > +++ b/tools/testing/selftests/bpf/prog_tests/send_signal.c > @@ -223,7 +223,8 @@ static void test_send_signal_perf(bool signal_thread, bool remote) > static void test_send_signal_nmi(bool signal_thread, bool remote) > { > struct perf_event_attr attr = { > - .sample_period = 1, > + .freq = 1, > + .sample_freq = 1000, > .type = PERF_TYPE_HARDWARE, > .config = PERF_COUNT_HW_CPU_CYCLES, > }; > > But I don't understand why. > As far as I can figure from kernel source code, > sample_period is measured in nanoseconds (is it?), I believe sample_period is a number of samples. 1 means that perf suppose to generate event very often. It means nanoseconds only for SW cpu_cycles. let's apply above workaround and move on. Pls send a patch.