On Mon, Nov 13, 2023 at 5:16 AM Jesper Dangaard Brouer <hawk@xxxxxxxxxx> wrote: > > > > On 11/2/23 23:58, Stanislav Fomichev wrote: > > diff --git a/include/uapi/linux/if_xdp.h b/include/uapi/linux/if_xdp.h > > index 2ecf79282c26..b0ee7ad19b51 100644 > > --- a/include/uapi/linux/if_xdp.h > > +++ b/include/uapi/linux/if_xdp.h > > @@ -106,6 +106,41 @@ struct xdp_options { > > #define XSK_UNALIGNED_BUF_ADDR_MASK \ > > ((1ULL << XSK_UNALIGNED_BUF_OFFSET_SHIFT) - 1) > > > > +/* Request transmit timestamp. Upon completion, put it into tx_timestamp > > + * field of struct xsk_tx_metadata. > > + */ > > +#define XDP_TXMD_FLAGS_TIMESTAMP (1 << 0) > > + > > +/* Request transmit checksum offload. Checksum start position and offset > > + * are communicated via csum_start and csum_offset fields of struct > > + * xsk_tx_metadata. > > + */ > > +#define XDP_TXMD_FLAGS_CHECKSUM (1 << 1) > > + > > +/* AF_XDP offloads request. 'request' union member is consumed by the driver > > + * when the packet is being transmitted. 'completion' union member is > > + * filled by the driver when the transmit completion arrives. > > + */ > > +struct xsk_tx_metadata { > > + union { > > + struct { > > + __u32 flags; > > + > > + /* XDP_TXMD_FLAGS_CHECKSUM */ > > + > > + /* Offset from desc->addr where checksumming should start. */ > > + __u16 csum_start; > > + /* Offset from csum_start where checksum should be stored. */ > > + __u16 csum_offset; > > + } request; > > + > > + struct { > > + /* XDP_TXMD_FLAGS_TIMESTAMP */ > > + __u64 tx_timestamp; > > + } completion; > > + }; > > +}; > > This looks wrong to me. It looks like member @flags is not avail at > completion time. At completion time, I assume we also want to know if > someone requested to get the timestamp for this packet (else we could > read garbage). I've moved the parts that are preserved across tx and tx completion into xsk_tx_metadata_compl. This is to address Magnus/Maciej feedback where userspace might race with the kernel. See: https://lore.kernel.org/bpf/ZNoJenzKXW5QSR3E@boxer/ > Another thing (I've raised this before): It would be really practical to > store an u64 opaque value at TX and then read it at Completion time. > One use-case is a forwarding application storing HW RX-time and > comparing this to TX completion time to deduce the time spend processing > the packet. This can be another member, right? But note that extending xsk_tx_metadata_compl might be a bit complicated because drivers have to carry this info somewhere. So we have to balance the amount of passed data between the tx and the completion.