On Wednesday, January 8, 2025 12:50 AM, Stanislav Fomichev <stfomichev@xxxxxxxxx> wrote:
>On 01/06, Song Yoong Siang wrote:
>> Extend the XDP Tx metadata framework so that user can requests launch time
>> hardware offload, where the Ethernet device will schedule the packet for
>> transmission at a pre-determined time called launch time. The value of
>> launch time is communicated from user space to Ethernet driver via
>> launch_time field of struct xsk_tx_metadata.
>>
>> Suggested-by: Stanislav Fomichev <sdf@xxxxxxxxxx>

Hi Stanislav Fomichev,

Thanks for your review comments.
I notice that you have two email addresses: sdf@xxxxxxxxxx and stfomichev@xxxxxxxxx.
Which one should I use in the Suggested-by tag?

>> Signed-off-by: Song Yoong Siang <yoong.siang.song@xxxxxxxxx>
>> ---
>>  Documentation/netlink/specs/netdev.yaml | 4 ++
>>  Documentation/networking/xsk-tx-metadata.rst | 64 ++++++++++++++++++++
>>  include/net/xdp_sock.h | 10 +++
>>  include/net/xdp_sock_drv.h | 1 +
>>  include/uapi/linux/if_xdp.h | 10 +++
>>  include/uapi/linux/netdev.h | 3 +
>>  net/core/netdev-genl.c | 2 +
>>  net/xdp/xsk.c | 3 +
>>  tools/include/uapi/linux/if_xdp.h | 10 +++
>>  tools/include/uapi/linux/netdev.h | 3 +
>>  10 files changed, 110 insertions(+)
>>
>> diff --git a/Documentation/netlink/specs/netdev.yaml b/Documentation/netlink/specs/netdev.yaml
>> index cbb544bd6c84..e59c8a14f7d1 100644
>> --- a/Documentation/netlink/specs/netdev.yaml
>> +++ b/Documentation/netlink/specs/netdev.yaml
>> @@ -70,6 +70,10 @@ definitions:
>>          name: tx-checksum
>>          doc:
>>            L3 checksum HW offload is supported by the driver.
>> +      -
>> +        name: tx-launch-time
>> +        doc:
>> +          Launch time HW offload is supported by the driver.
>>    -
>>      name: queue-type
>>      type: enum
>> diff --git a/Documentation/networking/xsk-tx-metadata.rst b/Documentation/networking/xsk-tx-metadata.rst
>> index e76b0cfc32f7..3cec089747ce 100644
>> --- a/Documentation/networking/xsk-tx-metadata.rst
>> +++ b/Documentation/networking/xsk-tx-metadata.rst
>> @@ -50,6 +50,10 @@ The flags field enables the particular offload:
>>    checksum. ``csum_start`` specifies byte offset of where the checksumming
>>    should start and ``csum_offset`` specifies byte offset where the
>>    device should store the computed checksum.
>> +- ``XDP_TXMD_FLAGS_LAUNCH_TIME``: requests the device to schedule the
>> +  packet for transmission at a pre-determined time called launch time. The
>> +  value of launch time is indicated by ``launch_time`` field of
>> +  ``union xsk_tx_metadata``.
>>
>>  Besides the flags above, in order to trigger the offloads, the first
>>  packet's ``struct xdp_desc`` descriptor should set ``XDP_TX_METADATA``
>> @@ -65,6 +69,65 @@ In this case, when running in ``XDK_COPY`` mode, the TX checksum
>>  is calculated on the CPU. Do not enable this option in production because
>>  it will negatively affect performance.
>>
>> +Launch Time
>> +===========
>> +
>> +The value of the requested launch time should be based on the device's PTP
>> +Hardware Clock (PHC) to ensure accuracy. AF_XDP takes a different data path
>> +compared to the ETF queuing discipline, which organizes packets and delays
>> +their transmission. Instead, AF_XDP immediately hands off the packets to
>> +the device driver without rearranging their order or holding them prior to
>> +transmission. In scenarios where the launch time offload feature is
>> +disabled, the device driver is expected to disregard the launch time
>> +request. For correct interpretation and meaningful operation, the launch
>> +time should never be set to a value larger than the farthest programmable
>> +time in the future (the horizon).
Different devices have different hardware
>> +limitations on the launch time offload feature.
>> +
>> +stmmac driver
>> +-------------
>> +
>> +For stmmac, TSO and launch time (TBS) features are mutually exclusive for
>> +each individual Tx Queue. By default, the driver configures Tx Queue 0 to
>> +support TSO and the rest of the Tx Queues to support TBS. The launch time
>> +hardware offload feature can be enabled or disabled by using the tc-etf
>> +command to call the driver's ndo_setup_tc() callback.
>> +
>> +The value of the launch time that is programmed in the Enhanced Normal
>> +Transmit Descriptors is a 32-bit value, where the most significant 8 bits
>> +represent the time in seconds and the remaining 24 bits represent the time
>> +in 256 ns increments. The programmed launch time is compared against the
>> +PTP time (bits[39:8]) and rolls over after 256 seconds. Therefore, the
>> +horizon of the launch time for dwmac4 and dwxlgmac2 is 128 seconds in the
>> +future.
>> +
>> +The stmmac driver maintains FIFO behavior and does not perform packet
>> +reordering. This means that a packet with a launch time request will block
>> +other packets in the same Tx Queue until it is transmitted.
>> +
>> +igc driver
>> +----------
>> +
>> +For igc, all four Tx Queues support the launch time feature. The launch
>> +time hardware offload feature can be enabled or disabled by using the
>> +tc-etf command to call the driver's ndo_setup_tc() callback. When entering
>> +TSN mode, the igc driver will reset the device and create a default Qbv
>> +schedule with a 1-second cycle time, with all Tx Queues open at all times.
>> +
>> +The value of the launch time that is programmed in the Advanced Transmit
>> +Context Descriptor is a relative offset to the starting time of the Qbv
>> +transmission window of the queue. The Frst flag of the descriptor can be
>> +set to schedule the packet for the next Qbv cycle.
Therefore, the horizon
>> +of the launch time for i225 and i226 is the ending time of the next cycle
>> +of the Qbv transmission window of the queue. For example, when the Qbv
>> +cycle time is set to 1 second, the horizon of the launch time ranges
>> +from 1 second to 2 seconds, depending on where the Qbv cycle is currently
>> +running.
>> +
>> +The igc driver maintains FIFO behavior and does not perform packet
>> +reordering. This means that a packet with a launch time request will block
>> +other packets in the same Tx Queue until it is transmitted.
>
>Since two devices we initially support are using FIFO mode, should we more
>explicitly target this case? Maybe even call netdev features
>tx-launch-time-fifo? In the future, if/when we get support timing-wheel-like
>queues, we can export another tx-launch-time-wheel?
>
>It seems important for the userspace to know which mode it's running.
>In a fifo mode, it might make sense to allocate separate queues
>for scheduling things far into the future/etc.

You are right. Users should dedicate one queue to scheduling things far into
the future and use the other queues for normal traffic.

>
>Thoughts? No code changes required, just more explicitly state the
>expectations.

I agree with you. Let me change the name from tx-launch-time to
tx-launch-time-fifo to explicitly state the FIFO behavior.

Thanks & Regards
Siang
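
For reference, the stmmac launch time layout described in the quoted documentation (8 bits of seconds, 24 bits in 256 ns increments, rollover after 256 seconds, 128-second horizon) can be sketched roughly as below. This is only an illustration under stated assumptions, not the driver's code: the helper names stmmac_pack_launch_time() and launch_time_within_horizon() are hypothetical, the input is assumed to be an absolute PHC time in nanoseconds, and the horizon check simply treats anything 128 seconds or more ahead of the current PHC time as out of range.

```c
#include <stdbool.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000ULL

/*
 * Hypothetical helper (not part of the stmmac driver): pack an absolute
 * PHC time in nanoseconds into the 32-bit layout described above.
 *   bits[31:24] = seconds, modulo 256 (hence the 256-second rollover)
 *   bits[23:0]  = sub-second part, in 256 ns increments
 */
static uint32_t stmmac_pack_launch_time(uint64_t phc_ns)
{
	uint32_t sec = (uint32_t)((phc_ns / NSEC_PER_SEC) & 0xff);
	uint32_t frac = (uint32_t)(((phc_ns % NSEC_PER_SEC) >> 8) & 0xffffff);

	return (sec << 24) | frac;
}

/*
 * Hypothetical check (an assumption, not a driver API): accept a launch
 * time only if it is not in the past and is less than 128 seconds ahead
 * of the current PHC time, i.e. within the horizon.
 */
static bool launch_time_within_horizon(uint64_t now_ns, uint64_t launch_ns)
{
	return launch_ns >= now_ns &&
	       (launch_ns - now_ns) < 128ULL * NSEC_PER_SEC;
}
```

With FIFO behavior, a check like this matters mostly for the queue that is dedicated to far-future packets, since one out-of-range launch time would stall everything queued behind it.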