On Wed, Nov 9, 2022 at 3:21 AM Toke Høiland-Jørgensen <toke@xxxxxxxxxx> wrote: > > Stanislav Fomichev <sdf@xxxxxxxxxx> writes: > > > xskxceiver conveniently setups up veth pairs so it seems logical > > to use veth as an example for some of the metadata handling. > > > > We timestamp skb right when we "receive" it, store its > > pointer in new veth_xdp_buff wrapper and generate BPF bytecode to > > reach it from the BPF program. > > > > This largely follows the idea of "store some queue context in > > the xdp_buff/xdp_frame so the metadata can be reached out > > from the BPF program". > > > > Cc: John Fastabend <john.fastabend@xxxxxxxxx> > > Cc: David Ahern <dsahern@xxxxxxxxx> > > Cc: Martin KaFai Lau <martin.lau@xxxxxxxxx> > > Cc: Jakub Kicinski <kuba@xxxxxxxxxx> > > Cc: Willem de Bruijn <willemb@xxxxxxxxxx> > > Cc: Jesper Dangaard Brouer <brouer@xxxxxxxxxx> > > Cc: Anatoly Burakov <anatoly.burakov@xxxxxxxxx> > > Cc: Alexander Lobakin <alexandr.lobakin@xxxxxxxxx> > > Cc: Magnus Karlsson <magnus.karlsson@xxxxxxxxx> > > Cc: Maryam Tahhan <mtahhan@xxxxxxxxxx> > > Cc: xdp-hints@xxxxxxxxxxxxxxx > > Cc: netdev@xxxxxxxxxxxxxxx > > Signed-off-by: Stanislav Fomichev <sdf@xxxxxxxxxx> > > --- > > drivers/net/veth.c | 31 +++++++++++++++++++++++++++++++ > > 1 file changed, 31 insertions(+) > > > > diff --git a/drivers/net/veth.c b/drivers/net/veth.c > > index 917ba57453c1..0e629ceb087b 100644 > > --- a/drivers/net/veth.c > > +++ b/drivers/net/veth.c > > @@ -25,6 +25,7 @@ > > #include <linux/filter.h> > > #include <linux/ptr_ring.h> > > #include <linux/bpf_trace.h> > > +#include <linux/bpf_patch.h> > > #include <linux/net_tstamp.h> > > > > #define DRV_NAME "veth" > > @@ -118,6 +119,7 @@ static struct { > > > > struct veth_xdp_buff { > > struct xdp_buff xdp; > > + struct sk_buff *skb; > > }; > > > > static int veth_get_link_ksettings(struct net_device *dev, > > @@ -602,6 +604,7 @@ static struct xdp_frame *veth_xdp_rcv_one(struct veth_rq *rq, > > > > xdp_convert_frame_to_buff(frame, xdp); > > xdp->rxq = &rq->xdp_rxq; > > + vxbuf.skb = NULL; > > > > act = bpf_prog_run_xdp(xdp_prog, xdp); > > > > @@ -826,6 +829,7 @@ static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq, > > > > orig_data = xdp->data; > > orig_data_end = xdp->data_end; > > + vxbuf.skb = skb; > > > > act = bpf_prog_run_xdp(xdp_prog, xdp); > > > > @@ -942,6 +946,7 @@ static int veth_xdp_rcv(struct veth_rq *rq, int budget, > > struct sk_buff *skb = ptr; > > > > stats->xdp_bytes += skb->len; > > + __net_timestamp(skb); > > skb = veth_xdp_rcv_skb(rq, skb, bq, stats); > > if (skb) { > > if (skb_shared(skb) || skb_unclone(skb, GFP_ATOMIC)) > > @@ -1665,6 +1670,31 @@ static int veth_xdp(struct net_device *dev, struct netdev_bpf *xdp) > > } > > } > > > > +static void veth_unroll_kfunc(const struct bpf_prog *prog, u32 func_id, > > + struct bpf_patch *patch) > > +{ > > + if (func_id == xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_TIMESTAMP_SUPPORTED)) { > > + /* return true; */ > > + bpf_patch_append(patch, BPF_MOV64_IMM(BPF_REG_0, 1)); > > + } else if (func_id == xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_TIMESTAMP)) { > > + bpf_patch_append(patch, > > + /* r5 = ((struct veth_xdp_buff *)r1)->skb; */ > > + BPF_LDX_MEM(BPF_DW, BPF_REG_5, BPF_REG_1, > > + offsetof(struct veth_xdp_buff, skb)), > > + /* if (r5 == NULL) { */ > > + BPF_JMP_IMM(BPF_JNE, BPF_REG_5, 0, 2), > > + /* return 0; */ > > + BPF_MOV64_IMM(BPF_REG_0, 0), > > + BPF_JMP_A(1), > > + /* } else { */ > > + /* return ((struct sk_buff *)r5)->tstamp; */ > > + BPF_LDX_MEM(BPF_DW, BPF_REG_0, BPF_REG_5, > > + offsetof(struct sk_buff, tstamp)), > > + /* } */ > > I don't think it's realistic to expect driver developers to write this > level of BPF instructions for everything. With the 'patch' thing it > should be feasible to write some helpers that driver developers can use, > right? E.g., this one could be: > > bpf_read_context_member_u64(size_t ctx_offset, size_t member_offset) > > called as: > > bpf_read_context_member_u64(offsetof(struct veth_xdp_buff, skb), offsetof(struct sk_buff, tstamp)); > > or with some macro trickery we could even hide the offsetof so you just > pass in types and member names? Definitely; let's start with the one you're proposing, we'll figure out the rest as we go; thx! > -Toke >