On 5/1/24 3:15 PM, Kui-Feng Lee wrote:
On 5/1/24 11:48, Martin KaFai Lau wrote:
On 4/29/24 2:36 PM, Kui-Feng Lee wrote:
+/* Called from the subsystem that consumes the struct_ops.
+ *
+ * The caller should protect this function by holding rcu_read_lock() to
+ * ensure "data" is valid. However, this function may unlock rcu
+ * temporarily. The caller should not rely on the preceding rcu_read_lock()
+ * after returning from this function.
Temporarily losing rcu_read_lock protection is error-prone. The caller
should do the inc_not_zero() itself instead, if it is needed.
I feel the approach in patches 1 and 3 is a little boxed in by the earlier
tcp-cc usage, which tried to fit into the kernel module reg/unreg paradigm and
hide as many bpf details as possible from tcp-cc. This is not necessarily true
now for other subsystems that have bpf struct_ops from day one.
The epoll detach notification is link-only. Can this kernel-side specific
unreg be limited to struct_ops links only? During reg, an RCU-protected link
could be passed to the subsystem. That subsystem becomes a kernel user of the
bpf link and it can call link_detach(link) to detach. Pseudo code:
	struct link __rcu *link;

	rcu_read_lock();
	ref_link = rcu_dereference(link);
	if (ref_link)
		ref_link = bpf_link_inc_not_zero(ref_link);
	rcu_read_unlock();

	if (!IS_ERR_OR_NULL(ref_link)) {
		bpf_struct_ops_map_link_detach(ref_link);
		bpf_link_put(ref_link);
	}
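
For readers less familiar with the inc_not_zero() idiom used in the pseudo code, here is a minimal userspace sketch of it, assuming C11 atomics in place of the kernel's refcount helpers. The names struct obj, obj_inc_not_zero() and obj_put() are hypothetical stand-ins, not kernel APIs, and the RCU read-side section that keeps the memory valid in the kernel is not modelled here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted object such as a bpf link. */
struct obj {
	atomic_long refcnt;
};

/* Take a reference only if the count is still non-zero.
 * Returns false when the object is already on its way to being freed.
 */
static bool obj_inc_not_zero(struct obj *o)
{
	long old = atomic_load(&o->refcnt);

	while (old != 0) {
		/* On failure, "old" is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference. Returns true when the last reference went away. */
static bool obj_put(struct obj *o)
{
	long old = atomic_fetch_sub(&o->refcnt, 1);

	assert(old > 0);
	return old == 1;
}
```

The point of the idiom is that a reader who only holds a weak (RCU-style) pointer never resurrects an object whose count has already dropped to zero; it either gets a real reference or walks away.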
Since not every struct_ops map has a link, we need a callback in addition
to ops->reg to register links with subsystems. If the callback is
ops->reg_link, struct_ops will call ops->reg_link if a subsystem provides
it and the map is registered through a link; otherwise it should call ops->reg.
I would just add a link pointer arg to the existing reg(). The same probably
needs to be done for unreg(). Pass NULL as the link if it does not have one
during reg(). If the subsystem chooses to enforce link-only struct_ops, it can
reject the registration if a link is not provided during reg().
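
A rough userspace sketch of what that reg() shape could look like, purely for illustration. struct link, struct ops, subsys_reg() and the -EINVAL return value are all assumptions made up for this example; they are not the actual kernel definitions or the final signature.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's definitions. */
struct link {
	int id;
};

struct ops {
	/* "link" is NULL when the map is registered without a link. */
	int (*reg)(void *kdata, struct link *link);
};

/* Link remembered by a subsystem that enforces link-only struct_ops. */
static struct link *registered_link;

static int subsys_reg(void *kdata, struct link *link)
{
	(void)kdata;

	/* Reject registrations that do not come through a link. */
	if (!link)
		return -EINVAL;

	registered_link = link;
	return 0;
}

static const struct ops subsys_ops = {
	.reg = subsys_reg,
};
```

A subsystem that does not care about links would simply ignore the extra argument, so existing users keep working with a one-line signature change.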