On 11/12/2024 5:30 AM, Martin KaFai Lau wrote:
On 11/9/24 12:40 AM, Xu Kuohai wrote:
On 11/9/2024 3:39 AM, Martin KaFai Lau wrote:
On 11/8/24 12:26 AM, Xu Kuohai wrote:
-static void bpf_testmod_test_2(int a, int b)
+static void bpf_dummy_unreg(void *kdata, struct bpf_link *link)
 {
+	WRITE_ONCE(__bpf_dummy_ops, &__bpf_testmod_ops);
 }
[ ... ]
+static int run_struct_ops(const char *val, const struct kernel_param *kp)
+{
+	int ret;
+	unsigned int repeat;
+	struct bpf_testmod_ops *ops;
+
+	ret = kstrtouint(val, 10, &repeat);
+	if (ret)
+		return ret;
+
+	if (repeat > 10000)
+		return -ERANGE;
+
+	while (repeat-- > 0) {
+		ops = READ_ONCE(__bpf_dummy_ops);
I don't think the usual bpf_struct_ops implementations protect the registered ops with only READ_ONCE and WRITE_ONCE like this. tcp-cc uses a refcnt+rcu. It seems hid uses synchronize_srcu(). sched_ext seems to also use kthread_flush_work() to wait for all ops calls to finish. Meaning I don't think the current bpf_struct_ops unreg implementation will run into this issue for sleepable ops.
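As a concrete illustration of the point above, here is a minimal sketch of a subsystem doing its own protection in unreg. It is not taken from tcp-cc, hid or sched_ext; the demo_* names are made up, and struct bpf_testmod_ops from the test module is reused only as an example. SRCU is used so that in-flight (possibly sleepable) calls are waited for before unreg returns:

#include <linux/bpf.h>
#include <linux/errno.h>
#include <linux/rcupdate.h>
#include <linux/srcu.h>

DEFINE_STATIC_SRCU(demo_srcu);
static struct bpf_testmod_ops __rcu *demo_ops;

static int demo_reg(void *kdata, struct bpf_link *link)
{
	rcu_assign_pointer(demo_ops, kdata);
	return 0;
}

static void demo_unreg(void *kdata, struct bpf_link *link)
{
	RCU_INIT_POINTER(demo_ops, NULL);
	/* wait until every in-flight call has left the read-side section */
	synchronize_srcu(&demo_srcu);
}

/* every call into the registered ops goes through a read-side section */
static int demo_call_test_1(void)
{
	struct bpf_testmod_ops *ops;
	int idx, ret = -ENODEV;

	idx = srcu_read_lock(&demo_srcu);
	ops = srcu_dereference(demo_ops, &demo_srcu);
	if (ops && ops->test_1)
		ret = ops->test_1();
	srcu_read_unlock(&demo_srcu, idx);
	return ret;
}

The waiting happens in the subsystem's unreg path here, not in the struct_ops map free path.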
Thanks for the explanation.
Are you saying that it's not the struct_ops framework's
responsibility to ensure the struct_ops map is not
released while it may still be in use? And the "bug" in
this series should be "fixed" in the test, namely this
patch?
Yeah, it is what I was trying to say. I don't think there is anything to fix. Think about a subsystem being extended by a kernel module: the subsystem will also do the needed protection itself during the unreg process. There is already a bpf_try_module_get() to help the subsystem.
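For reference, a small sketch of how a subsystem-side caller could use it; demo_run_once() and the owner parameter are illustrative only, and struct bpf_testmod_ops from the test module is reused just as an example:

#include <linux/bpf.h>
#include <linux/errno.h>
#include <linux/module.h>

/* Pin the ops before calling into it and release it afterwards, so
 * unreg/map free cannot race with the call. For a struct_ops-backed
 * ops, bpf_try_module_get() takes a reference on the struct_ops map;
 * for a plain kernel-module ops it falls back to try_module_get().
 */
static int demo_run_once(struct bpf_testmod_ops *ops, struct module *owner)
{
	int ret;

	if (!bpf_try_module_get(ops, owner))
		return -EBUSY;

	ret = ops->test_1 ? ops->test_1() : 0;

	bpf_module_put(ops, owner);
	return ret;
}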
Got it
The current synchronize_rcu_mult(call_rcu, call_rcu_tasks) is only needed for tcp-cc because a tcp-cc ops (which uses refcnt+rcu) can decrement its own refcnt. Looking back, this was a mistake (mine). A new tcp-cc op should have been introduced instead that returns a new tcp-cc ops to be used.
Not quite clear, but from the description, it seems that the synchronize_rcu_mult(call_rcu, call_rcu_tasks) could just be removed in some way, with no need for a cleanup that switches it to call_rcu.

This synchronize_rcu_mult is only needed for tcp_congestion_ops (bpf_tcp_ca.c). Maybe it is cleaner to just make a special case for "tcp_congestion_ops" in st_ops->name in map_alloc and only set free_after_mult_rcu_gp to true for this one case, so it won't slow down the freeing of other struct_ops maps.

OK, will give it a try, thanks.

imo, the test in this patch is not needed in its current form either, since it is not how kernel subsystems implement unreg in struct_ops.
+		if (ops->test_1)
+			ops->test_1();
+		if (ops->test_2)
+			ops->test_2(0, 0);
+	}
+
+	return 0;
+}
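Regarding the "tcp_congestion_ops" special case suggested above, a rough sketch of the idea. free_after_mult_rcu_gp and st_ops->name come from the discussion; the demo_* helper names, the visibility of struct bpf_struct_ops_map, and the exact placement are assumptions for illustration, not the real bpf_struct_ops.c code:

#include <linux/bpf.h>
#include <linux/rcupdate_wait.h>
#include <linux/string.h>

static void demo_set_free_policy(struct bpf_struct_ops_map *st_map,
				 const struct bpf_struct_ops *st_ops)
{
	/* Only tcp-cc can switch to another ops from inside one of its
	 * own callbacks, so only this case needs to wait for both a
	 * regular and a tasks RCU grace period before the map is freed.
	 */
	st_map->free_after_mult_rcu_gp =
		!strcmp(st_ops->name, "tcp_congestion_ops");
}

static void demo_free_struct_ops_map(struct bpf_struct_ops_map *st_map)
{
	if (st_map->free_after_mult_rcu_gp)
		synchronize_rcu_mult(call_rcu, call_rcu_tasks);
	else
		synchronize_rcu();

	bpf_map_area_free(st_map);
}

demo_set_free_policy() would be called from map_alloc, so only the tcp-cc case pays for the extra tasks-RCU grace period when the map is freed.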