On 3/25/2024 04:09, Song Yoong Siang wrote:
This patch adds support to per-packet Tx hardware timestamp request to AF_XDP zero-copy packet via XDP Tx metadata framework. Please note that user needs to enable Tx HW timestamp capability via igc_ioctl() with SIOCSHWTSTAMP cmd before sending xsk Tx hardware timestamp request. Same as implementation in RX timestamp XDP hints kfunc metadata, Timer 0 (adjustable clock) is used in xsk Tx hardware timestamp. i225/i226 have four sets of timestamping registers. *skb and *xsk_tx_buffer pointers are used to indicate whether the timestamping register is already occupied. Furthermore, a boolean variable named xsk_pending_ts is used to hold the transmit completion until the tx hardware timestamp is ready. This is because, for i225/i226, the timestamp notification event comes some time after the transmit completion event. The driver will retrigger hardware irq to clean the packet after retrieve the tx hardware timestamp. Besides, xsk_meta is added into struct igc_tx_timestamp_request as a hook to the metadata location of the transmit packet. When the Tx timestamp interrupt is fired, the interrupt handler will copy the value of Tx hwts into metadata location via xsk_tx_metadata_complete(). This patch is tested with tools/testing/selftests/bpf/xdp_hw_metadata on Intel ADL-S platform. Below are the test steps and results. Test Step 1: Run xdp_hw_metadata app ./xdp_hw_metadata <iface> > /dev/shm/result.log Test Step 2: Enable Tx hardware timestamp hwstamp_ctl -i <iface> -t 1 -r 1 Test Step 3: Run ptp4l and phc2sys for time synchronization Test Step 4: Generate UDP packets with 1ms interval for 10s trafgen --dev <iface> '{eth(da=<addr>), udp(dp=9091)}' -t 1ms -n 10000 Test Step 5: Rerun Step 1-3 with 10s iperf3 as background traffic Test Step 6: Rerun Step 1-4 with 10s iperf3 as background traffic Based on iperf3 results below, the impact of holding tx completion to throughput is not observable. Result of last UDP packet (no. 10000) in Step 4: poll: 1 (0) skip=99 fail=0 redir=10000 xsk_ring_cons__peek: 1 0x5640a37972d0: rx_desc[9999]->addr=f2110 addr=f2110 comp_addr=f2110 EoP rx_hash: 0x2049BE1D with RSS type:0x1 HW RX-time: 1679819246792971268 (sec:1679819246.7930) delta to User RX-time sec:0.0000 (14.990 usec) XDP RX-time: 1679819246792981987 (sec:1679819246.7930) delta to User RX-time sec:0.0000 (4.271 usec) No rx_vlan_tci or rx_vlan_proto, err=-95 0x5640a37972d0: ping-pong with csum=ab19 (want 315b) csum_start=34 csum_offset=6 0x5640a37972d0: complete tx idx=9999 addr=f010 HW TX-complete-time: 1679819246793036971 (sec:1679819246.7930) delta to User TX-complete-time sec:0.0001 (77.656 usec) XDP RX-time: 1679819246792981987 (sec:1679819246.7930) delta to User TX-complete-time sec:0.0001 (132.640 usec) HW RX-time: 1679819246792971268 (sec:1679819246.7930) delta to HW TX-complete-time sec:0.0001 (65.703 usec) 0x5640a37972d0: complete rx idx=10127 addr=f2110 Result of iperf3 without tx hwts request in step 5: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.74 GBytes 2.36 Gbits/sec 0 sender [ 5] 0.00-10.05 sec 2.74 GBytes 2.34 Gbits/sec receiver Result of iperf3 running parallel with trafgen command in step 6: [ ID] Interval Transfer Bitrate Retr [ 5] 0.00-10.00 sec 2.74 GBytes 2.36 Gbits/sec 0 sender [ 5] 0.00-10.04 sec 2.74 GBytes 2.34 Gbits/sec receiver Co-developed-by: Lai Peter Jun Ann <jun.ann.lai@xxxxxxxxx> Signed-off-by: Lai Peter Jun Ann <jun.ann.lai@xxxxxxxxx> Signed-off-by: Song Yoong Siang <yoong.siang.song@xxxxxxxxx> Acked-by: John Fastabend <john.fastabend@xxxxxxxxx> --- V1: https://patchwork.kernel.org/project/netdevbpf/patch/20231215162158.951925-1-yoong.siang.song@xxxxxxxxx/ V2: https://patchwork.kernel.org/project/netdevbpf/cover/20240301162348.898619-1-yoong.siang.song@xxxxxxxxx/ V3: https://patchwork.kernel.org/project/netdevbpf/cover/20240303083225.1184165-1-yoong.siang.song@xxxxxxxxx/ changelog: V1 -> V2 - In struct igc_tx_timestamp_request, keep a pointer to igc_tx_buffer, instead of pointing xsk_pending_ts (Vinicius). - In struct igc_tx_timestamp_request, introduce buffer_type to indicate whether skb or igc_tx_buffer pointer should be use (Vinicius). - In struct igc_metadata_request, remove igc_adapter pointer (Vinicius). - When request tx hwts, copy the value of cmd_type, instead of using pointer (Vinicius). - For boolean variable, use true and false, instead of 1 and 0 (Vinicius). - In igc_xsk_request_timestamp(), make an early return if none of the 4 ts registers is available (Vinicius). - Create helper functions to clear tx buffer and skb for tstamp (John). - Perform throughput test with mix traffic (Vinicius & John). V2 -> V3 - Improve tstamp reg searching loop for better readability (John). - In igc_ptp_free_tx_buffer(), add comment to inform user that tstamp->xsk_tx_buffer and tstamp->skb are in union (John). V3 -> V4 - Add protection with xp_tx_metadata_enabled (Kurt & Maciej). --- --- drivers/net/ethernet/intel/igc/igc.h | 71 ++++++++------ drivers/net/ethernet/intel/igc/igc_main.c | 113 ++++++++++++++++++++-- drivers/net/ethernet/intel/igc/igc_ptp.c | 51 ++++++++-- 3 files changed, 195 insertions(+), 40 deletions(-)
Tested-by: Naama Meir <naamax.meir@xxxxxxxxxxxxxxx>