On 8/1/2024 11:07 PM, Kalle Valo wrote: > Baochen Qiang <quic_bqiang@xxxxxxxxxxx> writes: > >> In transmit path, it is likely that the iova is not aligned to PCIe TLP >> max payload size, which is 128 for WCN7850. Normally in such cases hardware >> is expected to split the packet into several parts in a manner such that >> they, other than the first one, have aligned iova. However due to hardware >> limitations, WCN7850 does not behave like that properly with some specific >> unaligned iova in transmit path. This easily results in target hang in a >> KPI transmit test: packet send/receive failure, WMI command send timeout >> etc. Also fatal error seen in PCIe level: >> >> ... >> Capabilities: ... >> ... >> DevSta: ... FatalErr+ ... >> ... >> ... >> >> Work around this by manually moving/reallocating payload buffer such that >> we can map it to a 128 bytes aligned iova. The moving requires sufficient >> head room or tail room in skb: for the former we can do ourselves a favor >> by asking some extra bytes when registering with mac80211, while for the >> latter we can do nothing. >> >> Moving/reallocating buffer consumes additional CPU cycles, but the good news >> is that an aligned iova increases PCIe efficiency. In my tests on some X86 >> platforms the KPI results are almost consistent. >> >> Since this is seen only with WCN7850, add a new hardware parameter to >> differentiate from others. >> >> Tested-on: WCN7850 hw2.0 PCI WLAN.HMT.1.0.c5-00481-QCAHMTSWPL_V1.0_V2.0_SILICONZ-3 >> >> Signed-off-by: Baochen Qiang <quic_bqiang@xxxxxxxxxxx> > > [...] > >> --- a/drivers/net/wireless/ath/ath12k/dp_tx.c >> +++ b/drivers/net/wireless/ath/ath12k/dp_tx.c >> @@ -162,6 +162,60 @@ static int ath12k_dp_prepare_htt_metadata(struct sk_buff *skb) >> return 0; >> } >> >> +static void ath12k_dp_tx_move_payload(struct sk_buff *skb, >> + unsigned long delta, >> + bool head) >> +{ >> + unsigned long len = skb->len; >> + >> + if (head) { >> + skb_push(skb, delta); >> + memmove(skb->data, skb->data + delta, len); >> + skb_trim(skb, len); >> + } else { >> + skb_put(skb, delta); >> + memmove(skb->data + delta, skb->data, len); >> + skb_pull(skb, delta); >> + } >> +} > > I'm nitpicking, but usually booleans like the head variable here don't > help with readability. Having two separate functions would be easier to > read, but this is fine as it's so small. > >> @@ -279,6 +334,23 @@ int ath12k_dp_tx(struct ath12k *ar, struct ath12k_vif *arvif, >> goto fail_remove_tx_buf; >> } >> >> + if (iova_mask && >> + (unsigned long)skb->data & iova_mask) { >> + ret = ath12k_dp_tx_align_payload(ab, &skb); >> + if (ret) { >> + dev_warn_once(ab->dev, "failed to align TX buffer %d\n", ret); > > Why dev_warn_once()? I changed it to ath12k_warn() in the pending > branch. > >> --- a/drivers/net/wireless/ath/ath12k/hw.h >> +++ b/drivers/net/wireless/ath/ath12k/hw.h >> @@ -96,6 +96,8 @@ >> #define ATH12K_M3_FILE "m3.bin" >> #define ATH12K_REGDB_FILE_NAME "regdb.bin" >> >> +#define PCIE_MAX_PAYLOAD_SIZE 128 > > PCIE prefix implies that this is in PCI subsystem. I renamed it to > ATH12K_PCIE_MAX_PAYLOAD_SIZE. > > Please check my changes: > > https://git.kernel.org/pub/scm/linux/kernel/git/ath/ath.git/commit/?h=pending&id=b603c1e0d94fb1eb0576ef48ebe37c8c1ce86328 looks good to me, thanks. >