On 7/22/2024 2:54 PM, Baochen Qiang wrote:
>
>
> On 7/19/2024 10:10 PM, Jeff Johnson wrote:
>> On 7/14/2024 7:38 PM, Baochen Qiang wrote:
>>> In transmit path, it is likely that the iova is not aligned to PCIe TLP
>>> max payload size, which is 128 for WCN7850. Normally in such cases hardware
>>> is expected to split the packet into several parts in a manner such that
>>> they, other than the first one, have aligned iova. However due to hardware
>>> limitations, WCN7850 does not behave like that properly with some specific
>>> unaligned iova in transmit path. This easily results in target hang in a
>>> KPI transmit test: packet send/receive failure, WMI command send timeout
>>> etc. Also fatal error seen in PCIe level:
>>>
>>> ...
>>> Capabilities: ...
>>> ...
>>> DevSta: ... FatalErr+ ...
>>> ...
>>> ...
>>>
>>> Work around this by manually moving/reallocating payload buffer such that
>>> we can map it to a 128 bytes aligned iova. The moving requires sufficient
>>> head room or tail room in skb: for the former we can do ourselves a favor
>>> by asking some extra bytes when registering with mac80211, while for the
>>> latter we can do nothing.
>>>
>>> Moving/reallocating buffer consumes additional CPU cycles, but the good news
>>> is that an aligned iova increases PCIe efficiency. In my tests on some X86
>>> platforms the KPI results are almost consistent.
>>>
>>> Since this is seen only with WCN7850, add a new hardware parameter to
>>> differentiate from others.
>>
>> I asked for expert opinion on this patch and received the following response.
>> Baochen, can you take a look at this suggestion?
>>
>>> Aligning headers is sometimes done, but it appears the driver
>>> doesn't support scatter gather? I think the author may want to advertise
> right, ath12k does not support SG currently.
>
>>> scatter and linearize manually in the driver, to a correct offset.
> is there an existing skb API or API combinations which can do that for me?
> I checked __skb_linearize() and it does not take an 'offset' argument. or
> do I need to implement it myself from a very low level basis? like (if
> required) allocating skb structure, allocating/aligning payload buffer,
> copying/freeing paged frag/frag list, etc..
>
>>> Because now core is linearizing the skb in validate_xmit_skb()
>>> and then the driver moves it a second time..
>>
>>
>
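For reference, the "move the payload into head room or tail room" step
from the quoted commit message can be sketched in plain userspace C. This
is only an illustration of the arithmetic, not the patch's code and not a
kernel API: struct buf, align_payload() and TLP_MAX_PAYLOAD are
hypothetical names, and the sketch assumes the low 7 bits of the CPU
address match those of the iova after mapping.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* 128 is the TLP max payload size quoted for WCN7850 in the commit message. */
#define TLP_MAX_PAYLOAD 128u

/* Hypothetical stand-in for an skb's linear buffer; not a kernel type. */
struct buf {
	uint8_t *head;	/* start of the allocation */
	uint8_t *data;	/* start of the payload */
	size_t len;	/* payload length in bytes */
	size_t size;	/* total allocation size in bytes */
};

static size_t headroom(const struct buf *b)
{
	return (size_t)(b->data - b->head);
}

static size_t tailroom(const struct buf *b)
{
	return b->size - headroom(b) - b->len;
}

/*
 * Shift the payload so that its start address is a multiple of
 * TLP_MAX_PAYLOAD. Head room is tried first, then tail room.
 * Returns 0 on success, -1 if neither is large enough (the case
 * where a real driver would have to reallocate the buffer).
 */
static int align_payload(struct buf *b)
{
	/* Assumption: CPU address used as a stand-in for the iova. */
	size_t back = (size_t)((uintptr_t)b->data & (TLP_MAX_PAYLOAD - 1));
	size_t fwd;

	if (back == 0)
		return 0;	/* already aligned, nothing to move */

	if (headroom(b) >= back) {
		/* Move toward the head: smaller shift when it fits. */
		memmove(b->data - back, b->data, b->len);
		b->data -= back;
		return 0;
	}

	fwd = TLP_MAX_PAYLOAD - back;
	if (tailroom(b) >= fwd) {
		/* Not enough head room: move toward the tail instead. */
		memmove(b->data + fwd, b->data, b->len);
		b->data += fwd;
		return 0;
	}

	return -1;
}
```

The fall-through to -1 corresponds to the "we can do nothing" case for
tail room in the commit message, which is where the extra copy on top of
the core's linearization becomes unavoidable.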