Hi,

On 8/30/2022 9:13 AM, Martin KaFai Lau wrote:
> On Mon, Aug 29, 2022 at 10:27:52PM +0800, Hou Tao wrote:
>> From: Hou Tao <houtao1@xxxxxxxxxx>
>>
>> When there are concurrent task local storage lookup operations,
>> if updates on per-cpu bpf_task_storage_busy is not preemption-safe,
>> some updates will be lost due to interleave, the final value of
>> bpf_task_storage_busy will not be zero and bpf_task_storage_trylock()
>> on specific cpu will fail forever.
>>
>> So add a test case to ensure the update of per-cpu bpf_task_storage_busy
>> is preemption-safe.
> This test took my setup 1.5 minute to run
> and cannot reproduce after running the test in a loop.
>
> Can it be reproduced in a much shorter time ?
> If not, test_maps is probably a better place to do the test.
I think the answer is no. I have thought about adding the test to test_maps,
but the test case needs to run a bpf program to check whether the value of
bpf_task_storage_busy is OK, so for simplicity I added it to test_progs. If the
running time is a problem, I can move it into test_maps.
> I assume it can be reproduced in arm with this test? Or it can
> also be reproduced in other platforms with different kconfig.
> Please paste the test failure message and the platform/kconfig
> to reproduce it in the commit message.
On arm64 it can be reproduced probabilistically on a 2-CPU VM when
CONFIG_PREEMPT is enabled, as shown below. You can try increasing the values of
nr and loop if it still cannot be reproduced.

test_preemption:PASS:skel_open_and_load 0 nsec
test_preemption:PASS:no mem 0 nsec
test_preemption:PASS:skel_attach 0 nsec
test_preemption:FAIL:bpf_task_storage_get fails unexpected bpf_task_storage_get fails: actual 0 != expected 1
#174/4 task_local_storage/preemption:FAIL
#174 task_local_storage:FAIL

All error logs:
test_preemption:PASS:skel_open_and_load 0 nsec
test_preemption:PASS:no mem 0 nsec
test_preemption:PASS:skel_attach 0 nsec
test_preemption:FAIL:bpf_task_storage_get fails unexpected bpf_task_storage_get fails: actual 0 != expected 1
#174/4 task_local_storage/preemption:FAIL
#174 task_local_storage:FAIL
Summary: 0/0 PASSED, 0 SKIPPED, 1 FAILED

On x86-64 __this_cpu_{inc|dec} are atomic, so it is not possible to reproduce the problem.
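
For illustration only, here is a minimal user-space sketch of the lost update described in the commit message. It is not the kernel code: `busy`, `inc_load()`, `inc_store()`, `dec()`, `trylock()` and `simulate_lost_update()` are hypothetical names, and the increment is split into a separate load and store by hand to model an architecture where the update is not a single instruction. The `trylock()` model assumes the lock succeeds only when the counter goes from 0 to 1.

```c
#include <stdbool.h>

/* Hypothetical stand-in for one CPU's bpf_task_storage_busy counter. */
static int busy;

/* A non-preemption-safe increment, split into its load and store halves. */
static int inc_load(void) { return busy; }
static void inc_store(int loaded) { busy = loaded + 1; }

/* Decrement on unlock; kept as one step to keep the sketch short. */
static void dec(void) { busy = busy - 1; }

/* Assumed trylock model: succeed only if the counter goes from 0 to 1. */
static bool trylock(void)
{
	busy = busy + 1;
	if (busy != 1) {
		busy = busy - 1;	/* undo and report failure */
		return false;
	}
	return true;
}

/* Replay one bad interleaving of two tasks on the same CPU. */
static int simulate_lost_update(void)
{
	int a, b;

	busy = 0;
	a = inc_load();		/* task A loads 0, then gets preempted */
	b = inc_load();		/* task B loads 0 */
	inc_store(b);		/* task B stores 1 */
	inc_store(a);		/* task A resumes and stores 1: B's increment is lost */
	dec();			/* task A unlocks: counter is 0 */
	dec();			/* task B unlocks: counter is -1 */
	return busy;
}
```

After `simulate_lost_update()` the counter is left at -1, so every later `trylock()` in this model sees a value other than 1 and fails, which matches the "fail forever" behaviour the test is meant to catch.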