Re: [PATCH] wifi: ath11k: fix rx completion meta data corruption

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Johan,

On Fri, Mar 21, 2025 at 9:55 AM Johan Hovold <johan+linaro@xxxxxxxxxx> wrote:
>
> Add the missing memory barrier to make sure that the REO dest ring
> descriptor is read after the head pointer to avoid using stale data on
> weakly ordered architectures like aarch64.
>
> This may fix the ring-buffer corruption worked around by commit
> f9fff67d2d7c ("wifi: ath11k: Fix SKB corruption in REO destination
> ring") by silently discarding data, and may possibly also address user
> reported errors like:
>
>         ath11k_pci 0006:01:00.0: msdu_done bit in attention is not set
>
> Tested-on: WCN6855 hw2.1 WLAN.HSP.1.1-03125-QCAHSPSWPL_V1_V2_SILICONZ_LITE-3.6510.41
>
> Fixes: d5c65159f289 ("ath11k: driver for Qualcomm IEEE 802.11ax devices")
> Cc: stable@xxxxxxxxxxxxxxx      # 5.6
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=218005
> Signed-off-by: Johan Hovold <johan+linaro@xxxxxxxxxx>
> ---
>
> As I reported here:
>
>         https://lore.kernel.org/lkml/Z9G5zEOcTdGKm7Ei@xxxxxxxxxxxxxxxxxxxx/
>
> the ath11k and ath12k appear to be missing a number of memory barriers
> that are required on weakly ordered architectures like aarch64 to avoid
> memory corruption issues.
>
> Here's a fix for one more such case which people already seem to be
> hitting.
>
> Note that I've seen one "msdu_done" bit not set warning also with this
> patch so whether it helps with that at all remains to be seen. I'm CCing
> Jens and Steev that see these warnings frequently and that may be able
> to help out with testing.
>
> Johan
>
>
>  drivers/net/wireless/ath/ath11k/dp_rx.c | 25 ++++++++++++++++---------
>  1 file changed, 16 insertions(+), 9 deletions(-)
>
> diff --git a/drivers/net/wireless/ath/ath11k/dp_rx.c b/drivers/net/wireless/ath/ath11k/dp_rx.c
> index 029ecf51c9ef..0a57b337e4c6 100644
> --- a/drivers/net/wireless/ath/ath11k/dp_rx.c
> +++ b/drivers/net/wireless/ath/ath11k/dp_rx.c
> @@ -2646,7 +2646,7 @@ int ath11k_dp_process_rx(struct ath11k_base *ab, int ring_id,
>         struct ath11k *ar;
>         struct hal_reo_dest_ring *desc;
>         enum hal_reo_dest_ring_push_reason push_reason;
> -       u32 cookie;
> +       u32 cookie, info0, rx_msdu_info0, rx_mpdu_info0;
>         int i;
>
>         for (i = 0; i < MAX_RADIOS; i++)
> @@ -2659,11 +2659,14 @@ int ath11k_dp_process_rx(struct ath11k_base *ab, int ring_id,
>  try_again:
>         ath11k_hal_srng_access_begin(ab, srng);
>
> +       /* Make sure descriptor is read after the head pointer. */
> +       dma_rmb();
> +
>         while (likely(desc =
>               (struct hal_reo_dest_ring *)ath11k_hal_srng_dst_get_next_entry(ab,
>                                                                              srng))) {
>                 cookie = FIELD_GET(BUFFER_ADDR_INFO1_SW_COOKIE,
> -                                  desc->buf_addr_info.info1);
> +                                  READ_ONCE(desc->buf_addr_info.info1));
>                 buf_id = FIELD_GET(DP_RXDMA_BUF_COOKIE_BUF_ID,
>                                    cookie);
>                 mac_id = FIELD_GET(DP_RXDMA_BUF_COOKIE_PDEV_ID, cookie);
> @@ -2692,8 +2695,9 @@ int ath11k_dp_process_rx(struct ath11k_base *ab, int ring_id,
>
>                 num_buffs_reaped[mac_id]++;
>
> +               info0 = READ_ONCE(desc->info0);
>                 push_reason = FIELD_GET(HAL_REO_DEST_RING_INFO0_PUSH_REASON,
> -                                       desc->info0);
> +                                       info0);
>                 if (unlikely(push_reason !=
>                              HAL_REO_DEST_RING_PUSH_REASON_ROUTING_INSTRUCTION)) {
>                         dev_kfree_skb_any(msdu);
> @@ -2701,18 +2705,21 @@ int ath11k_dp_process_rx(struct ath11k_base *ab, int ring_id,
>                         continue;
>                 }
>
> -               rxcb->is_first_msdu = !!(desc->rx_msdu_info.info0 &
> +               rx_msdu_info0 = READ_ONCE(desc->rx_msdu_info.info0);
> +               rx_mpdu_info0 = READ_ONCE(desc->rx_mpdu_info.info0);
> +
> +               rxcb->is_first_msdu = !!(rx_msdu_info0 &
>                                          RX_MSDU_DESC_INFO0_FIRST_MSDU_IN_MPDU);
> -               rxcb->is_last_msdu = !!(desc->rx_msdu_info.info0 &
> +               rxcb->is_last_msdu = !!(rx_msdu_info0 &
>                                         RX_MSDU_DESC_INFO0_LAST_MSDU_IN_MPDU);
> -               rxcb->is_continuation = !!(desc->rx_msdu_info.info0 &
> +               rxcb->is_continuation = !!(rx_msdu_info0 &
>                                            RX_MSDU_DESC_INFO0_MSDU_CONTINUATION);
>                 rxcb->peer_id = FIELD_GET(RX_MPDU_DESC_META_DATA_PEER_ID,
> -                                         desc->rx_mpdu_info.meta_data);
> +                                         READ_ONCE(desc->rx_mpdu_info.meta_data));
>                 rxcb->seq_no = FIELD_GET(RX_MPDU_DESC_INFO0_SEQ_NUM,
> -                                        desc->rx_mpdu_info.info0);
> +                                        rx_mpdu_info0);
>                 rxcb->tid = FIELD_GET(HAL_REO_DEST_RING_INFO0_RX_QUEUE_NUM,
> -                                     desc->info0);
> +                                     info0);
>
>                 rxcb->mac_id = mac_id;
>                 __skb_queue_tail(&msdu_list[mac_id], msdu);
> --
> 2.48.1
>

While the fix is definitely a fix, it does not seem to help with the
`msdu_done bit in attention is not set` message as I have seen it 43
times in the last 12 hours.

-- Steev





[Index of Archives]     [Linux Kernel]     [Kernel Development Newbies]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite Hiking]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux