On 6/6/23 6:36 AM, Ben Greear wrote:
It seems that when tx fails (due to TX_STATUS_FAIL_LONG_LIMIT in this case), the retry-count is sometimes reported as zero. I would expect that it actually must have retried many times (15 it seems?). At least for fail-long-limit failures, is there a known constant amount of times the firmware retransmits the frame before giving up? If so, I can just hack that value into the retry counter in this case. Debug patch: [greearb@ben-dt5 iwlwifi]$ git diff diff --git a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c index 1cddc65dd51e..6544dabb092b 100644 --- a/drivers/net/wireless/intel/iwlwifi/mvm/tx.c +++ b/drivers/net/wireless/intel/iwlwifi/mvm/tx.c @@ -1929,8 +1929,14 @@ static void iwl_mvm_rx_tx_cmd_single(struct iwl_mvm *mvm, mvm->ethtool_stats.tx_mpdu_attempts += info->status.rates[0].count; mvm->ethtool_stats.tx_mpdu_retry += tx_resp->failure_frame; if (cb.flags & IWL_TX_CB_TXO_USED) { + u32 idx = status & TX_STATUS_MSK; + mvm->ethtool_stats.txo_tx_mpdu_attempts += info->status.rates[0].count; mvm->ethtool_stats.txo_tx_mpdu_retry += tx_resp->failure_frame; + if (idx != TX_STATUS_SUCCESS) { + pr_info("txo tx status failed: %d tx_resp->failure_frame: %d\n", + idx, tx_resp->failure_frame); + } } iwl_mvm_hwrate_to_tx_status(mvm, mvm->fw, From dmesg: Jun 06 06:29:40 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 15 Jun 06 06:29:43 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 15 Jun 06 06:29:48 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 15 Jun 06 06:29:49 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 15 Jun 06 06:29:52 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 15 Jun 06 06:29:57 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 0 Jun 06 06:29:57 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 0 Jun 06 06:29:58 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 0 Jun 06 06:30:01 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 15 Jun 06 06:30:02 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 15 Jun 06 06:30:03 ct523c-0b29 kernel: txo tx status failed: 131 tx_resp->failure_frame: 15
I think it has some other weirdness too. I added printout to show any non-zero failure_frame: + if (idx != TX_STATUS_SUCCESS || tx_resp->failure_frame) { + pr_info("txo tx status: %d tx_resp->failure_frame: %d\n", + idx, tx_resp->failure_frame); + } For a while, it showed the status 131 and failure_frame of 0, then it started reporting lots of smaller numbers of retries (more like what I would expect to see). And then it seemed to stick in a case where it was reporting lots of successes with 15 retries: Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 Jun 06 06:44:50 ct523c-0b29 kernel: txo tx status: 1 tx_resp->failure_frame: 15 I think it is quite unlikely that it reliably succeeded on the very last retry so many times in a row. Maybe the FW has some issues in reporting the failure_frame count properly? Thanks, Ben