Thanks Yonghong, Sorry for the delay in here. I have split commits into 3 as you asked. Will send it out shortly. Have few questions below re: find_vma test. > On Mar 3, 2022, at 12:31 PM, Yonghong Song <yhs@xxxxxx> wrote: > > > > On 3/3/22 9:29 AM, Mykola Lysenko wrote: >>> On Mar 3, 2022, at 7:36 AM, Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote: >>> >>> On 3/2/22 10:27 PM, Mykola Lysenko wrote: >>>> In send_signal, replace sleep with dummy cpu intensive computation >>>> to increase probability of child process being scheduled. Add few >>>> more asserts. >>>> In find_vma, reduce sample_freq as higher values may be rejected in >>>> some qemu setups, remove usleep and increase length of cpu intensive >>>> computation. >>>> In bpf_cookie, perf_link and perf_branches, reduce sample_freq as >>>> higher values may be rejected in some qemu setups >>>> Signed-off-by: Mykola Lysenko <mykolal@xxxxxx> >>>> Acked-by: Yonghong Song <yhs@xxxxxx> >>>> --- >>>> .../selftests/bpf/prog_tests/bpf_cookie.c | 2 +- >>>> .../testing/selftests/bpf/prog_tests/find_vma.c | 13 ++++++++++--- >>>> .../selftests/bpf/prog_tests/perf_branches.c | 4 ++-- >>>> .../selftests/bpf/prog_tests/perf_link.c | 2 +- >>>> .../selftests/bpf/prog_tests/send_signal.c | 17 ++++++++++------- >>>> .../selftests/bpf/progs/test_send_signal_kern.c | 2 +- >>>> 6 files changed, 25 insertions(+), 15 deletions(-) >>>> diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c >>>> index cd10df6cd0fc..0612e79a9281 100644 >>>> --- a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c >>>> +++ b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c >>>> @@ -199,7 +199,7 @@ static void pe_subtest(struct test_bpf_cookie *skel) >>>> attr.type = PERF_TYPE_SOFTWARE; >>>> attr.config = PERF_COUNT_SW_CPU_CLOCK; >>>> attr.freq = 1; >>>> - attr.sample_freq = 4000; >>>> + attr.sample_freq = 1000; >>>> pfd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, PERF_FLAG_FD_CLOEXEC); >>>> if (!ASSERT_GE(pfd, 0, "perf_fd")) >>>> goto cleanup; >>>> diff --git a/tools/testing/selftests/bpf/prog_tests/find_vma.c b/tools/testing/selftests/bpf/prog_tests/find_vma.c >>>> index b74b3c0c555a..7cf4feb6464c 100644 >>>> --- a/tools/testing/selftests/bpf/prog_tests/find_vma.c >>>> +++ b/tools/testing/selftests/bpf/prog_tests/find_vma.c >>>> @@ -30,12 +30,20 @@ static int open_pe(void) >>>> attr.type = PERF_TYPE_HARDWARE; >>>> attr.config = PERF_COUNT_HW_CPU_CYCLES; >>>> attr.freq = 1; >>>> - attr.sample_freq = 4000; >>>> + attr.sample_freq = 1000; >>>> pfd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, PERF_FLAG_FD_CLOEXEC); >>>> return pfd >= 0 ? pfd : -errno; >>>> } >>>> +static bool find_vma_pe_condition(struct find_vma *skel) >>>> +{ >>>> + return skel->bss->found_vm_exec == 0 || >>>> + skel->data->find_addr_ret != 0 || >>> >>> Should this not test for `skel->data->find_addr_ret == -1` ? >> It seems that find_addr_ret changes value few times until it gets to 0. I added print statements when value is changed: >> find_addr_ret -1 => initial value >> find_addr_ret -16 => -EBUSY >> find_addr_ret 0 => final value >> Hence, in this case I think it is better to wait for the final value. We do have time out in the loop anyways (when “i" reaches 1bn), so test would not get stuck. > > Thanks for the above information. I read the code again. I think it is more complicated than above. Let us look at the bpf program: > > SEC("perf_event") > int handle_pe(void) > { > struct task_struct *task = bpf_get_current_task_btf(); > struct callback_ctx data = {}; > > if (task->pid != target_pid) > return 0; > > find_addr_ret = bpf_find_vma(task, addr, check_vma, &data, 0); > > /* In NMI, this should return -EBUSY, as the previous call is using > * the irq_work. > */ > find_zero_ret = bpf_find_vma(task, 0, check_vma, &data, 0); > return 0; > } > > Assuming task->pid == target_pid, > the first bpf program call should have > find_addr_ret = 0 /* lock irq_work */ > find_zero_ret = -EBUSY > > For the second bpf program call, there are two possibilities: > . irq_work is unlocked, so the result will find_addr_ret = 0, find_zero_ret = -EBUSY > . or irq_work is still locked, the result will be find_addr_ret = -EBUSY, find_zero_ret = -EBUSY > > the third bpf program call will be similar to the second bpf program run. > > So final validation check probably should check both 0 and -EBUSY > for find_addr_ret. > Do you mean we need to add additional test in test_and_reset_skel function or in find_vma_pe_condition? Do we really need to do final check for skel->data->find_addr_ret in test_and_reset_skel if we already confirmed It became 0 previously? > Leaving some time to potentially unlock the irq_work as in the original > code is still needed to prevent potential problem for the subsequent tests. By leaving some time, do you mean to revert removal of the next line in serial_test_find_vma function? usleep(100000); /* allow the irq_work to finish */ > > I think this patch can be broke into three separate commits: > - find_vma fix > - send_signal fix > - other > to make changes a little bit focused. > >> TL:DR change in the test that prints these values >> - for (i = 0; i < 1000000000 && find_vma_pe_condition(skel); ++i) >> + int find_addr_ret = -1; >> + printf("find_addr_ret %d\n", skel->data->find_addr_ret); >> + >> + for (i = 0; i < 1000000000 && find_vma_pe_condition(skel); ++i) { >> + if (find_addr_ret != skel->data->find_addr_ret) { >> + find_addr_ret = skel->data->find_addr_ret; >> + printf("find_addr_ret %d\n", skel->data->find_addr_ret); >> + } >> ++j; >> + } >> + >> + printf("find_addr_ret %d\n", skel->data->find_addr_ret); >>> >>>> + skel->data->find_zero_ret == -1 || >>>> + strcmp(skel->bss->d_iname, "test_progs") != 0; >>>> +} >>>> + >>> Thanks, >>> Daniel