On Mon, Nov 7, 2022 at 2:02 PM Martin KaFai Lau <martin.lau@xxxxxxxxx> wrote: > > On 11/3/22 8:25 PM, Stanislav Fomichev wrote: > > > diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h > > index 59c9fd55699d..dba857f212d7 100644 > > --- a/include/linux/skbuff.h > > +++ b/include/linux/skbuff.h > > @@ -4217,9 +4217,13 @@ static inline bool skb_metadata_differs(const struct sk_buff *skb_a, > > true : __skb_metadata_differs(skb_a, skb_b, len_a); > > } > > > > +void skb_metadata_import_from_xdp(struct sk_buff *skb, size_t len); > > + > > static inline void skb_metadata_set(struct sk_buff *skb, u8 meta_len) > > { > > skb_shinfo(skb)->meta_len = meta_len; > > + if (meta_len) > > + skb_metadata_import_from_xdp(skb, meta_len); > > } > > > [ ... ] > > > +struct xdp_to_skb_metadata { > > + u32 magic; /* xdp_metadata_magic */ > > + u64 rx_timestamp; > > +} __randomize_layout; > > + > > +struct bpf_patch; > > + > > [ ... ] > > > +void skb_metadata_import_from_xdp(struct sk_buff *skb, size_t len) > > +{ > > + struct xdp_to_skb_metadata *meta = (void *)(skb_mac_header(skb) - len); > > + > > + /* Optional SKB info, currently missing: > > + * - HW checksum info (skb->ip_summed) > > + * - HW RX hash (skb_set_hash) > > + * - RX ring dev queue index (skb_record_rx_queue) > > + */ > > + > > + if (len != sizeof(struct xdp_to_skb_metadata)) > > + return; > > + > > + if (meta->magic != xdp_metadata_magic) > > + return; > > + > > + if (meta->rx_timestamp) { > > + *skb_hwtstamps(skb) = (struct skb_shared_hwtstamps){ > > + .hwtstamp = ns_to_ktime(meta->rx_timestamp), > > + }; > > + } > > +} > > Considering the metadata will affect the gro, should the meta be cleared after > importing to the skb? Yeah, good suggestion, will clear it here. > [ ... ] > > > +/* Since we're not actually doing a call but instead rewriting > > + * in place, we can only afford to use R0-R5 scratch registers. > > + * > > + * We reserve R1 for bpf_xdp_metadata_export_to_skb and let individual > > + * metadata kfuncs use only R0,R4-R5. > > + * > > + * The above also means we _cannot_ easily call any other helper/kfunc > > + * because there is no place for us to preserve our R1 argument; > > + * existing R6-R9 belong to the callee. > > + */ > > +void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch) > > +{ > > + u32 func_id; > > + > > + /* > > + * The code below generates the following: > > + * > > + * void bpf_xdp_metadata_export_to_skb(struct xdp_md *ctx) > > + * { > > + * struct xdp_to_skb_metadata *meta; > > + * int ret; > > + * > > + * ret = bpf_xdp_adjust_meta(ctx, -sizeof(*meta)); > > + * if (!ret) > > + * return; > > + * > > + * meta = ctx->data_meta; > > + * meta->magic = xdp_metadata_magic; > > + * meta->rx_timestamp = bpf_xdp_metadata_rx_timestamp(ctx); > > + * } > > + * > > + */ > > + > > + bpf_patch_append(patch, > > + /* r2 = ((struct xdp_buff *)r1)->data_meta; */ > > + BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, > > + offsetof(struct xdp_buff, data_meta)), > > + /* r3 = ((struct xdp_buff *)r1)->data; */ > > + BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_1, > > + offsetof(struct xdp_buff, data)), > > + /* if (data_meta != data) return; > > + * > > + * data_meta > data: xdp_data_meta_unsupported() > > + * data_meta < data: already used, no need to touch > > + */ > > + BPF_JMP_REG(BPF_JNE, BPF_REG_2, BPF_REG_3, S16_MAX), > > + > > + /* r2 -= sizeof(struct xdp_to_skb_metadata); */ > > + BPF_ALU64_IMM(BPF_SUB, BPF_REG_2, > > + sizeof(struct xdp_to_skb_metadata)), > > + /* r3 = ((struct xdp_buff *)r1)->data_hard_start; */ > > + BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_1, > > + offsetof(struct xdp_buff, data_hard_start)), > > + /* r3 += sizeof(struct xdp_frame) */ > > + BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, > > + sizeof(struct xdp_frame)), > > + /* if (data-sizeof(struct xdp_to_skb_metadata) < data_hard_start+sizeof(struct xdp_frame)) return; */ > > + BPF_JMP_REG(BPF_JLT, BPF_REG_2, BPF_REG_3, S16_MAX), > > + > > + /* ((struct xdp_buff *)r1)->data_meta = r2; */ > > + BPF_STX_MEM(BPF_DW, BPF_REG_1, BPF_REG_2, > > + offsetof(struct xdp_buff, data_meta)), > > + > > + /* *((struct xdp_to_skb_metadata *)r2)->magic = xdp_metadata_magic; */ > > + BPF_ST_MEM(BPF_W, BPF_REG_2, > > + offsetof(struct xdp_to_skb_metadata, magic), > > + xdp_metadata_magic), > > + ); > > + > > + /* r0 = bpf_xdp_metadata_rx_timestamp(ctx); */ > > + func_id = xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_TIMESTAMP); > > + prog->aux->xdp_kfunc_ndo->ndo_unroll_kfunc(prog, func_id, patch); > > + > > + bpf_patch_append(patch, > > + /* r2 = ((struct xdp_buff *)r1)->data_meta; */ > > + BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1, > > + offsetof(struct xdp_buff, data_meta)), > > + /* *((struct xdp_to_skb_metadata *)r2)->rx_timestamp = r0; */ > > + BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_0, > > + offsetof(struct xdp_to_skb_metadata, rx_timestamp)), > > Can the xdp prog still change the metadata through xdp->data_meta? tbh, I am not > sure it is solid enough by asking the xdp prog not to use the same random number > in its own metadata + not to change the metadata through xdp->data_meta after > calling bpf_xdp_metadata_export_to_skb(). What do you think the usecase here might be? Or are you suggesting we reject further access to data_meta after bpf_xdp_metadata_export_to_skb somehow? If we want to let the programs override some of this bpf_xdp_metadata_export_to_skb() metadata, it feels like we can add more kfuncs instead of exposing the layout? bpf_xdp_metadata_export_to_skb(ctx); bpf_xdp_metadata_export_skb_hash(ctx, 1234); ... > Does xdp_to_skb_metadata have a use case for XDP_PASS (like patch 7) or the > xdp_to_skb_metadata can be limited to XDP_REDIRECT only? XDP_PASS cases where we convert xdp_buff into skb in the drivers right now usually have C code to manually pull out the metadata (out of hw desc) and put it into skb. So, currently, if we're calling bpf_xdp_metadata_export_to_skb() for XDP_PASS, we're doing a double amount of work: skb_metadata_import_from_xdp first, then custom driver code second. In theory, maybe we should completely skip drivers custom parsing when there is a prog with BPF_F_XDP_HAS_METADATA? Then both xdp->skb paths (XDP_PASS+XDP_REDIRECT) will be bpf-driven and won't require any mental work (plus, the drivers won't have to care either in the future). WDYT? > > + ); > > + > > + bpf_patch_resolve_jmp(patch); > > +} > > + > > static int __init xdp_metadata_init(void) > > { > > + xdp_metadata_magic = get_random_u32() | 1; > > return register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &xdp_metadata_kfunc_set); > > } > > late_initcall(xdp_metadata_init);