On 11/9/24 12:40 AM, Xu Kuohai wrote:
On 11/9/2024 3:39 AM, Martin KaFai Lau wrote:
On 11/8/24 12:26 AM, Xu Kuohai wrote:
-static void bpf_testmod_test_2(int a, int b)
+static void bpf_dummy_unreg(void *kdata, struct bpf_link *link)
{
+ WRITE_ONCE(__bpf_dummy_ops, &__bpf_testmod_ops);
}
[ ... ]
+static int run_struct_ops(const char *val, const struct kernel_param *kp)
+{
+ int ret;
+ unsigned int repeat;
+ struct bpf_testmod_ops *ops;
+
+ ret = kstrtouint(val, 10, &repeat);
+ if (ret)
+ return ret;
+
+ if (repeat > 10000)
+ return -ERANGE;
+
+ while (repeat-- > 0) {
+ ops = READ_ONCE(__bpf_dummy_ops);
I don't think it is the usual bpf_struct_ops implementation which only uses
READ_ONCE and WRITE_ONCE to protect the registered ops. tcp-cc uses a
refcnt+rcu. It seems hid uses synchronize_srcu(). sched_ext seems to also use
kthread_flush_work() to wait for all ops calling finished. Meaning I don't
think the current bpf_struct_ops unreg implementation will run into this issue
for sleepable ops.
Thanks for the explanation.
Are you saying that it's not the struct_ops framework's
responsibility to ensure the struct_ops map is not
released while it may be still in use? And the "bug" in
this series should be "fixed" in the test, namely this
patch?
Yeah, it is what I was trying to say. I don't think there is thing to fix. Think
about extending a subsystem by a kernel module. The subsystem will also do the
needed protection itself during the unreg process. There is already a
bpf_try_module_get() to help the subsystem.
The current synchronize_rcu_mult(call_rcu, call_rcu_tasks) is only needed for
the tcp-cc because a tcp-cc's ops (which uses refcnt+rcu) can decrement its
own refcnt. Looking back, this was a mistake (mine). A new tcp-cc ops should
have been introduced instead to return a new tcp-cc-ops to be used.
Not quite clear, but from the description, it seems that
the synchronize_rcu_mult(call_rcu, call_rcu_tasks) could
This synchronize_rcu_mult is only need for the tcp_congestion_ops
(bpf_tcp_ca.c). May be it is cleaner to just make a special case for
"tcp_congestion_ops" in st_ops->name in map_alloc and only set
free_after_mult_rcu_gp to TRUE for this one case, then it won't slow down other
struct_ops map freeing also.
imo, the test in this patch is not needed in its current form also since it is
not how the kernel subsystem implements unreg in struct_ops.
be just removed in some way, no need to do a cleanup to
switch it to call_rcu.
+ if (ops->test_1)
+ ops->test_1();
+ if (ops->test_2)
+ ops->test_2(0, 0);
+ }
+
+ return 0;
+}