Re: [PATCH v3 bpf-next] Improve BPF test stability (related to perf events and scheduling)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 3/8/22 9:29 AM, Mykola Lysenko wrote:
Thanks Yonghong,

Sorry for the delay in here.

I have split commits into 3 as you asked. Will send it out shortly. Have few questions below re: find_vma test.

On Mar 3, 2022, at 12:31 PM, Yonghong Song <yhs@xxxxxx> wrote:



On 3/3/22 9:29 AM, Mykola Lysenko wrote:
On Mar 3, 2022, at 7:36 AM, Daniel Borkmann <daniel@xxxxxxxxxxxxx> wrote:

On 3/2/22 10:27 PM, Mykola Lysenko wrote:
In send_signal, replace sleep with dummy cpu intensive computation
to increase probability of child process being scheduled. Add few
more asserts.
In find_vma, reduce sample_freq as higher values may be rejected in
some qemu setups, remove usleep and increase length of cpu intensive
computation.
In bpf_cookie, perf_link and perf_branches, reduce sample_freq as
higher values may be rejected in some qemu setups
Signed-off-by: Mykola Lysenko <mykolal@xxxxxx>
Acked-by: Yonghong Song <yhs@xxxxxx>
---
  .../selftests/bpf/prog_tests/bpf_cookie.c       |  2 +-
  .../testing/selftests/bpf/prog_tests/find_vma.c | 13 ++++++++++---
  .../selftests/bpf/prog_tests/perf_branches.c    |  4 ++--
  .../selftests/bpf/prog_tests/perf_link.c        |  2 +-
  .../selftests/bpf/prog_tests/send_signal.c      | 17 ++++++++++-------
  .../selftests/bpf/progs/test_send_signal_kern.c |  2 +-
  6 files changed, 25 insertions(+), 15 deletions(-)
diff --git a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
index cd10df6cd0fc..0612e79a9281 100644
--- a/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
+++ b/tools/testing/selftests/bpf/prog_tests/bpf_cookie.c
@@ -199,7 +199,7 @@ static void pe_subtest(struct test_bpf_cookie *skel)
  	attr.type = PERF_TYPE_SOFTWARE;
  	attr.config = PERF_COUNT_SW_CPU_CLOCK;
  	attr.freq = 1;
-	attr.sample_freq = 4000;
+	attr.sample_freq = 1000;
  	pfd = syscall(__NR_perf_event_open, &attr, -1, 0, -1, PERF_FLAG_FD_CLOEXEC);
  	if (!ASSERT_GE(pfd, 0, "perf_fd"))
  		goto cleanup;
diff --git a/tools/testing/selftests/bpf/prog_tests/find_vma.c b/tools/testing/selftests/bpf/prog_tests/find_vma.c
index b74b3c0c555a..7cf4feb6464c 100644
--- a/tools/testing/selftests/bpf/prog_tests/find_vma.c
+++ b/tools/testing/selftests/bpf/prog_tests/find_vma.c
@@ -30,12 +30,20 @@ static int open_pe(void)
  	attr.type = PERF_TYPE_HARDWARE;
  	attr.config = PERF_COUNT_HW_CPU_CYCLES;
  	attr.freq = 1;
-	attr.sample_freq = 4000;
+	attr.sample_freq = 1000;
  	pfd = syscall(__NR_perf_event_open, &attr, 0, -1, -1, PERF_FLAG_FD_CLOEXEC);
    	return pfd >= 0 ? pfd : -errno;
  }
  +static bool find_vma_pe_condition(struct find_vma *skel)
+{
+	return skel->bss->found_vm_exec == 0 ||
+		skel->data->find_addr_ret != 0 ||

Should this not test for `skel->data->find_addr_ret == -1` ?
It seems that find_addr_ret changes value few times until it gets to 0. I added print statements when value is changed:
find_addr_ret -1 => initial value
find_addr_ret -16 => -EBUSY
find_addr_ret 0 => final value
Hence, in this case I think it is better to wait for the final value. We do have time out in the loop anyways (when “i" reaches 1bn), so test would not get stuck.

Thanks for the above information. I read the code again. I think it is more complicated than above. Let us look at the bpf program:

SEC("perf_event")
int handle_pe(void)
{
        struct task_struct *task = bpf_get_current_task_btf();
        struct callback_ctx data = {};

        if (task->pid != target_pid)
                return 0;

        find_addr_ret = bpf_find_vma(task, addr, check_vma, &data, 0);

        /* In NMI, this should return -EBUSY, as the previous call is using
         * the irq_work.
         */
        find_zero_ret = bpf_find_vma(task, 0, check_vma, &data, 0);
        return 0;
}

Assuming task->pid == target_pid,
the first bpf program call should have
    find_addr_ret = 0     /* lock irq_work */
    find_zero_ret = -EBUSY

For the second bpf program call, there are two possibilities:
   . irq_work is unlocked, so the result will find_addr_ret = 0, find_zero_ret = -EBUSY
   . or irq_work is still locked, the result will be find_addr_ret = -EBUSY, find_zero_ret = -EBUSY

the third bpf program call will be similar to the second bpf program run.

So final validation check probably should check both 0 and -EBUSY
for find_addr_ret.


Do you mean we need to add additional test in test_and_reset_skel function or in find_vma_pe_condition?

No. There is no need for an additional test.


Do we really need to do final check for skel->data->find_addr_ret in test_and_reset_skel if we already confirmed
It became 0 previously?

Good point. Yes and no.
If we did hit find_vma_pe_condition(skel) is false, then we don't need to do subsequent checking. But if we didn't, checking is still needed to ensure the error is printed out. You could refactor the code such that only if find_vma_pe_condition(skel) is false and no subsequent checking, or
unconditionally subsequent checking.
Either way is fine with me.



Leaving some time to potentially unlock the irq_work as in the original
code is still needed to prevent potential problem for the subsequent tests.

By leaving some time, do you mean to revert removal of the next line in serial_test_find_vma function?
usleep(100000); /* allow the irq_work to finish */

Yes.



I think this patch can be broke into three separate commits:
  - find_vma fix
  - send_signal fix
  - other
to make changes a little bit focused.

TL:DR change in the test that prints these values
-       for (i = 0; i < 1000000000 && find_vma_pe_condition(skel); ++i)
+       int find_addr_ret = -1;
+       printf("find_addr_ret %d\n", skel->data->find_addr_ret);
+
+       for (i = 0; i < 1000000000 && find_vma_pe_condition(skel); ++i) {
+               if (find_addr_ret != skel->data->find_addr_ret) {
+                       find_addr_ret = skel->data->find_addr_ret;
+                       printf("find_addr_ret %d\n", skel->data->find_addr_ret);
+               }
                 ++j;
+       }
+
+       printf("find_addr_ret %d\n", skel->data->find_addr_ret);

+		skel->data->find_zero_ret == -1 ||
+		strcmp(skel->bss->d_iname, "test_progs") != 0;
+}
+
Thanks,
Daniel




[Index of Archives]     [Linux Samsung SoC]     [Linux Rockchip SoC]     [Linux Actions SoC]     [Linux for Synopsys ARC Processors]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]


  Powered by Linux