On Thu, Jun 22, 2023 at 2:47 PM Alexei Starovoitov
<alexei.starovoitov@xxxxxxxxx> wrote:
>
> On Thu, Jun 22, 2023 at 1:13 PM Stanislav Fomichev <sdf@xxxxxxxxxx> wrote:
> >
> > On Thu, Jun 22, 2023 at 12:58 PM Alexei Starovoitov
> > <alexei.starovoitov@xxxxxxxxx> wrote:
> > >
> > > On Wed, Jun 21, 2023 at 10:02:44AM -0700, Stanislav Fomichev wrote:
> > > > WIP, not tested, only to show the overall idea.
> > > > Non-AF_XDP paths are marked with 'false' for now.
> > > >
> > > > Cc: netdev@xxxxxxxxxxxxxxx
> > > > Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx>
> > > > ---
> > > >  .../net/ethernet/mellanox/mlx5/core/en/txrx.h | 11 +++
> > > >  .../net/ethernet/mellanox/mlx5/core/en/xdp.c  | 96 ++++++++++++++++++-
> > > >  .../net/ethernet/mellanox/mlx5/core/en/xdp.h  |  9 +-
> > > >  .../ethernet/mellanox/mlx5/core/en/xsk/tx.c   |  3 +
> > > >  .../net/ethernet/mellanox/mlx5/core/en_tx.c   | 16 ++++
> > > >  .../net/ethernet/mellanox/mlx5/core/main.c    | 26 ++++-
> > > >  6 files changed, 156 insertions(+), 5 deletions(-)
> > > >
> > > > diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
> > > > index 879d698b6119..e4509464e0b1 100644
> > > > --- a/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
> > > > +++ b/drivers/net/ethernet/mellanox/mlx5/core/en/txrx.h
> > > > @@ -6,6 +6,7 @@
> > > >
> > > >  #include "en.h"
> > > >  #include <linux/indirect_call_wrapper.h>
> > > > +#include <net/devtx.h>
> > > >
> > > >  #define MLX5E_TX_WQE_EMPTY_DS_COUNT (sizeof(struct mlx5e_tx_wqe) / MLX5_SEND_WQE_DS)
> > > >
> > > > @@ -506,4 +507,14 @@ static inline struct mlx5e_mpw_info *mlx5e_get_mpw_info(struct mlx5e_rq *rq, int
> > > >
> > > >       return (struct mlx5e_mpw_info *)((char *)rq->mpwqe.info + array_size(i, isz));
> > > >  }
> > > > +
> > > > +struct mlx5e_devtx_frame {
> > > > +     struct devtx_frame frame;
> > > > +     struct mlx5_cqe64 *cqe; /* tx completion */
> > >
> > > cqe is only valid at completion.
> > >
> > > > +     struct mlx5e_tx_wqe *wqe; /* tx */
> > >
> > > wqe is only valid at submission.
> > >
> > > imo that's a very clear sign that this is not a generic datastructure.
> > > The code is trying hard to make 'frame' part of it look common,
> > > but it won't help bpf prog to be 'generic'.
> > > It is still going to precisely coded for completion vs submission.
> > > Similarly a bpf prog for completion in veth will be different than bpf prog for completion in mlx5.
> > > As I stated earlier this 'generalization' and 'common' datastructure only adds code complexity.
> >
> > The reason I went with this abstract context is to allow the programs
> > to be attached to the different devices.
> > For example, the xdp_hw_metadata we currently have is not really tied
> > down to the particular implementation.
> > If every hook declaration looks different, it seems impossible to
> > create portable programs.
> >
> > The frame part is not really needed, we can probably rename it to ctx
> > and pass data/frags over the arguments?
> >
> > struct devtx_ctx {
> >   struct net_device *netdev;
> >   /* the devices will be able to create wrappers to stash descriptor pointers */
> > };
> > void veth_devtx_submit(struct devtx_ctx *ctx, void *data, u16 len, u8
> > meta_len, struct skb_shared_info *sinfo);
> >
> > But striving to have a similar hook declaration seems useful to
> > program portability sake?
>
> portability across what ?
> 'timestamp' on veth doesn't have a real use. It's testing only.
> Even testing is a bit dubious.
> I can see a need for bpf prog to run in the datacenter on mlx, brcm
> and whatever other nics, but they will have completely different
> hw descriptors. timestamp kfuncs to request/read can be common,
> but to read the descriptors bpf prog authors would need to write
> different code anyway.
> So kernel code going out its way to present somewhat common devtx_ctx
> just doesn't help. It adds code to the kernel, but bpf prog still
> has to be tailored for mlx and brcm differently.

Isn't it the same discussion/arguments we had during the RX series?
We want to provide common sane interfaces/abstractions via kfuncs.
That will make most BPF programs portable from mlx to brcm (for
example) without doing a rewrite.
We're also exposing raw (readonly) descriptors (via that get_ctx
helper) to the users who know what to do with them.
Most users don't know what to do with raw descriptors; the specs are
not public; things can change depending on fw version/etc/etc.
So the progs that touch raw descriptors are not the primary use-case.
(that was the tl;dr for rx part, seems like it applies here?)

Let's maybe discuss that mlx5 example? Are you proposing to do
something along these lines?

void mlx5e_devtx_submit(struct mlx5e_tx_wqe *wqe);
void mlx5e_devtx_complete(struct mlx5_cqe64 *cqe);

If yes, I'm missing how we define the common kfuncs in this case. The
kfuncs need to have some common context. We're defining them with:

bpf_devtx_<kfunc>(const struct devtx_frame *ctx);
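
To make the "common context" point concrete, here is a rough userspace
sketch (not kernel code, not from the patch) of the wrapper pattern the
quoted mlx5e_devtx_frame relies on: the kfunc-style function only sees
the common struct, and the driver side recovers its descriptor pointers
with container_of(). Everything prefixed mock_ is hypothetical, the
descriptor layouts and the MOCK_WQE_TS_REQUEST flag are made up, and
struct devtx_frame is reduced to one placeholder field.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Common context; the single field is a placeholder (assumption). */
struct devtx_frame {
	uint16_t len;
};

/* Mock descriptors; the real mlx5 WQE/CQE layouts are not modelled. */
struct mock_wqe { uint64_t flags; };
struct mock_cqe { uint64_t timestamp; };

/* Hypothetical flag meaning "please timestamp this packet". */
#define MOCK_WQE_TS_REQUEST (1ULL << 0)

/* Driver wrapper, shaped like mlx5e_devtx_frame in the quoted patch. */
struct mock_devtx_frame {
	struct devtx_frame frame;
	struct mock_cqe *cqe; /* non-NULL only at completion */
	struct mock_wqe *wqe; /* non-NULL only at submission */
};

/* Driver side of a hypothetical "request timestamp" kfunc. */
int mock_devtx_request_timestamp(const struct devtx_frame *ctx)
{
	struct mock_devtx_frame *f =
		container_of(ctx, struct mock_devtx_frame, frame);

	if (!f->wqe)
		return -1; /* called outside the submission hook */
	f->wqe->flags |= MOCK_WQE_TS_REQUEST;
	return 0;
}

/* Driver side of a hypothetical "read timestamp" kfunc. */
int mock_devtx_read_timestamp(const struct devtx_frame *ctx, uint64_t *ts)
{
	struct mock_devtx_frame *f =
		container_of(ctx, struct mock_devtx_frame, frame);

	if (!f->cqe)
		return -1; /* called outside the completion hook */
	*ts = f->cqe->timestamp;
	return 0;
}
```

The caller never touches mock_wqe or mock_cqe; only the two functions
above are per-driver. The submission-only / completion-only validity
that Alexei points out shows up here as the NULL checks.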