Re: [PATCH bpf-next v3 1/8] bpf: Add generic attach/detach/query API for multi-progs

On 7/10/23 9:10 AM, Daniel Borkmann wrote:
On 7/9/23 7:17 PM, Alexei Starovoitov wrote:
On Fri, Jul 07, 2023 at 07:24:48PM +0200, Daniel Borkmann wrote:
+
+#define BPF_MPROG_KEEP    0
+#define BPF_MPROG_SWAP    1
+#define BPF_MPROG_FREE    2

Please document how this is supposed to be used.
Patch 2 is using BPF_MPROG_FREE in tcx_entry_needs_release().
Where most of the code treats BPF_MPROG_SWAP and BPF_MPROG_FREE as equivalent.
I can guess what it's for, but a comment would help.

Ok, sounds good, will add a comment to these codes.

[...]
In the future, for cgroups, bpf_prog_run_array_cg() will keep explicit rcu_read_lock()
before accessing bpf_mprog_entry, right?
And bpf_mprog_commit() assumes that RCU protection.

Both yes.

All fine, but we need to document that the mprog mechanism is not suitable for sleepable progs.

Ok, I'll add a comment.

I've added the following comment to bpf_mprog.h to document the return codes,
the locking, and a usage example:

/*
 * bpf_mprog framework:
 * ~~~~~~~~~~~~~~~~~~~~
 *
 * bpf_mprog is a generic layer for multi-program attachment. In-kernel users
 * of the bpf_mprog don't need to care about the dependency resolution
 * internals, they can just consume it with a few API calls. Currently available
 * dependency directives are BPF_F_{BEFORE,AFTER} which enable insertion of
 * a BPF program or BPF link relative to an existing BPF program or BPF link
 * inside the multi-program array as well as prepend and append behavior if
 * no relative object was specified, see corresponding selftests for concrete
 * examples (e.g. tc_links and tc_opts test cases of test_progs).
 *
 * Usage of bpf_mprog_{attach,detach,query}() core APIs with pseudo code:
 *
 *  Attach case:
 *
 *   struct bpf_mprog_entry *entry, *peer;
 *   int ret;
 *
 *   // bpf_mprog user-side lock
 *   // fetch active @entry from attach location
 *   [...]
 *   ret = bpf_mprog_attach(entry, [...]);
 *   if (ret >= 0) {
 *       peer = bpf_mprog_peer(entry);
 *       if (bpf_mprog_swap_entries(ret))
 *           // swap @entry to @peer at attach location
 *       bpf_mprog_commit(entry);
 *       ret = 0;
 *   } else {
 *       // error path, bail out, propagate @ret
 *   }
 *   // bpf_mprog user-side unlock
 *
 *  Detach case:
 *
 *   struct bpf_mprog_entry *entry, *peer;
 *   bool release;
 *   int ret;
 *
 *   // bpf_mprog user-side lock
 *   // fetch active @entry from attach location
 *   [...]
 *   ret = bpf_mprog_detach(entry, [...]);
 *   if (ret >= 0) {
 *       release = ret == BPF_MPROG_FREE;
 *       peer = release ? NULL : bpf_mprog_peer(entry);
 *       if (bpf_mprog_swap_entries(ret))
 *           // swap @entry to @peer at attach location
 *       bpf_mprog_commit(entry);
 *       if (release)
 *           // free bpf_mprog_bundle
 *       ret = 0;
 *   } else {
 *       // error path, bail out, propagate @ret
 *   }
 *   // bpf_mprog user-side unlock
 *
 *  Query case:
 *
 *   struct bpf_mprog_entry *entry;
 *   int ret;
 *
 *   // bpf_mprog user-side lock
 *   // fetch active @entry from attach location
 *   [...]
 *   ret = bpf_mprog_query(attr, uattr, entry);
 *   // bpf_mprog user-side unlock
 *
 *  Data/fast path:
 *
 *   struct bpf_mprog_entry *entry;
 *   struct bpf_mprog_fp *fp;
 *   struct bpf_prog *prog;
 *   int ret = [...];
 *
 *   rcu_read_lock();
 *   // fetch active @entry from attach location
 *   [...]
 *   bpf_mprog_foreach_prog(entry, fp, prog) {
 *       ret = bpf_prog_run(prog, [...]);
 *       // process @ret from program
 *   }
 *   [...]
 *   rcu_read_unlock();
 *
 * bpf_mprog_{attach,detach}() return codes:
 *
 * Negative return code means that an error occurred and the bpf_mprog_entry
 * has not been changed. The error should be propagated to the user. A non-
 * negative return code can be one of the following:
 *
 * BPF_MPROG_KEEP:
 *   The bpf_mprog_entry does not need a/b swap, the bpf_mprog_fp item has
 *   been replaced in the current active bpf_mprog_entry.
 *
 * BPF_MPROG_SWAP:
 *   The bpf_mprog_entry does need an a/b swap and must be updated to its
 *   peer entry (peer = bpf_mprog_peer(entry)) which has been populated to
 *   the new bpf_mprog_fp item configuration.
 *
 * BPF_MPROG_FREE:
 *   The bpf_mprog_entry no longer holds any non-NULL bpf_mprog_fp items.
 *   The bpf_mprog_entry should be swapped with NULL and the corresponding
 *   bpf_mprog_bundle can be freed.
 *
 * bpf_mprog locking considerations:
 *
 * bpf_mprog_{attach,detach,query}() must be protected by an external lock
 * (like RTNL in case of tcx).
 *
 * bpf_mprog_entry pointer can be an __rcu annotated pointer (in case of tcx
 * the netdevice has tcx_ingress and tcx_egress __rcu pointer) which gets
 * updated via rcu_assign_pointer() pointing to the active bpf_mprog_entry of
 * the bpf_mprog_bundle.
 *
 * Fast path accesses the active bpf_mprog_entry within RCU critical section
 * (in case of tcx it runs in NAPI which provides RCU protection there,
 * other users might need explicit rcu_read_lock()). The bpf_mprog_commit()
 * assumes that RCU protection.
 *
 * The READ_ONCE()/WRITE_ONCE() pairing for bpf_mprog_fp's prog access is for
 * the replacement case where we don't swap the bpf_mprog_entry.
 */
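
To make the return-code handling above a bit more tangible, here is a rough
userspace model of the attach/detach flow. It is only a sketch and not the
kernel implementation: every model_*/location_* name, the MODEL_MAX capacity,
the static bundle and the "append only / remove last" behavior are made up for
illustration, the user-side lock is left out, and a C11 release store stands in
for rcu_assign_pointer(). It also assumes that SWAP and FREE both require the
attach location to be updated, as the pseudo code suggests.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

/* Return codes as quoted from the patch. */
#define BPF_MPROG_KEEP 0
#define BPF_MPROG_SWAP 1
#define BPF_MPROG_FREE 2

#define MODEL_MAX 4 /* hypothetical capacity, chosen for this sketch */

/* Hypothetical stand-ins; these are not the kernel structures. */
struct model_bundle;
struct model_entry {
	int progs[MODEL_MAX];        /* fake program ids */
	int count;                   /* number of used slots */
	struct model_bundle *parent;
};
struct model_bundle {
	struct model_entry a, b;     /* the a/b pair */
};

static struct model_bundle bundle;             /* stands in for an allocation */
static _Atomic(struct model_entry *) location; /* the "attach location" */

static struct model_entry *model_peer(struct model_entry *entry)
{
	assert(entry && entry->parent);
	return entry == &entry->parent->a ? &entry->parent->b : &entry->parent->a;
}

/* Assumption: SWAP and FREE both mean the attach location must be updated. */
static int model_swap_entries(int code)
{
	return code == BPF_MPROG_SWAP || code == BPF_MPROG_FREE;
}

/* Append @id: the new configuration is built in the inactive peer. */
static int model_attach(struct model_entry *entry, int id)
{
	struct model_entry *peer = model_peer(entry);

	if (entry->count == MODEL_MAX)
		return -ERANGE;
	memcpy(peer->progs, entry->progs, sizeof(peer->progs));
	peer->progs[entry->count] = id;
	peer->count = entry->count + 1;
	return BPF_MPROG_SWAP;
}

/* Remove the last program; report FREE once nothing would be left. */
static int model_detach(struct model_entry *entry)
{
	struct model_entry *peer = model_peer(entry);

	if (entry->count == 0)
		return -ENOENT;
	if (entry->count == 1)
		return BPF_MPROG_FREE;
	memcpy(peer->progs, entry->progs, sizeof(peer->progs));
	peer->count = entry->count - 1;
	return BPF_MPROG_SWAP;
}

/* Caller side, following the attach pseudo code (locking omitted). */
static int location_attach(int id)
{
	struct model_entry *entry = atomic_load(&location);
	int ret;

	if (!entry) { /* hypothetical "create bundle" step */
		memset(&bundle, 0, sizeof(bundle));
		bundle.a.parent = bundle.b.parent = &bundle;
		entry = &bundle.a;
	}
	ret = model_attach(entry, id);
	if (ret < 0)
		return ret;
	if (model_swap_entries(ret)) /* release store models rcu_assign_pointer() */
		atomic_store_explicit(&location, model_peer(entry), memory_order_release);
	return 0;
}

/* Caller side, following the detach pseudo code (locking omitted). */
static int location_detach(void)
{
	struct model_entry *entry = atomic_load(&location), *peer;
	int ret;

	if (!entry)
		return -ENOENT;
	ret = model_detach(entry);
	if (ret < 0)
		return ret;
	peer = ret == BPF_MPROG_FREE ? NULL : model_peer(entry);
	if (model_swap_entries(ret))
		atomic_store_explicit(&location, peer, memory_order_release);
	return 0; /* on FREE, a real user would now free the bundle */
}

/* Number of programs a reader would currently observe. */
static int location_count(void)
{
	struct model_entry *entry = atomic_load_explicit(&location, memory_order_acquire);

	return entry ? entry->count : 0;
}
```

In this model, calling location_attach() twice and then location_detach()
twice flips the location between the two entries on each SWAP and leaves it
NULL again after the final FREE. BPF_MPROG_KEEP is not exercised here, since
the sketch has no in-place replacement.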

Hope that helps,
Daniel



