On Sun, Sep 29, 2024 at 09:33:05PM -0700, Namhyung Kim wrote:
> On Mon, Sep 30, 2024 at 12:24:52PM +0900, Hyeonggon Yoo wrote:
> > On Mon, Sep 30, 2024 at 11:18 AM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > >
> > > Hello Hyeonggon,
> > >
> > > On Sun, Sep 29, 2024 at 11:27:25PM +0900, Hyeonggon Yoo wrote:
> > > > On Sun, Sep 29, 2024 at 3:13 PM Namhyung Kim <namhyung@xxxxxxxxxx> wrote:
> > > > > > +SEC("raw_tp/bpf_test_finish")
> > > > > > +int BPF_PROG(check_task_struct)
> > > > > > +{
> > > > > > +	__u64 curr = bpf_get_current_task();
> > > > > > +	struct kmem_cache *s;
> > > > > > +	char *name;
> > > > > > +
> > > > > > +	s = bpf_get_kmem_cache(curr);
> > > > > > +	if (s == NULL) {
> > > > > > +		found = -1;
> > > > > > +		return 0;
> > > > >
> > > > > ... it cannot find a kmem_cache for the current task.  This program is
> > > > > run by bpf_prog_test_run_opts() with BPF_F_TEST_RUN_ON_CPU.  So I think
> > > > > curr should point to a task_struct in a slab cache.
> > > > >
> > > > > Am I missing something?
> > > >
> > > > Hi Namhyung,
> > > >
> > > > Out of curiosity I've been investigating this issue on my machine and
> > > > running some experiments.
> > >
> > > Thanks a lot for looking at this!
> > >
> > > >
> > > > When the test fails, calling dump_page() for the page the task_struct
> > > > belongs to shows that the page does not have the PGTY_slab flag set,
> > > > which is why virt_to_slab(current) returns NULL.
> > > >
> > > > Does the test always fail in your environment? On my machine, the
> > > > test sometimes passed and sometimes failed.
> > >
> > > I'm using vmtest.sh and it mostly succeeded.  I thought I couldn't
> > > reproduce it locally, but I also see the failure sometimes.  I'll take a
> > > deeper look.
> > >
> > > >
> > > > Maybe sometimes the value returned by the 'current' macro belongs to a
> > > > slab, but sometimes it does not.
> > > > But that doesn't really make sense to me as IIUC task_struct
> > > > descriptors are allocated from slab.
> > >
> > > AFAIK the notable exception is the init_task, which lives in the kernel
> > > data.  I'm not sure if the test is running as PID 1.
> >
> > I checked that the test is running under PID 0 (swapper) when it fails and
> > a non-0 PID when it succeeds. This makes sense as the task_struct for PID 0
> > should be in the kernel image area, not in a slab.
> >
> > Phew, fortunately, it's not a bug! :)
>
> Thanks for the test, I've seen the same now.
>
> >
> > Any plans on how to adjust the test program?
>
> I thought the test runs in a separate task.  I'll think about how to
> test this more reliably.

Oh, I think BPF_F_TEST_RUN_ON_CPU was the problem, since it forces the
test to run on the given CPU (cpu0 in this case).  If cpu0 was idle, the
test would run in the swapper task and fail like this.  I think removing
the flag will run the test on the current CPU, so it won't get the
swapper task anymore.

Thanks,
Namhyung