This is a note to let you know that I've just added the patch titled i40e: fix: remove needless retries of NVM update to the 6.6-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: i40e-fix-remove-needless-retries-of-nvm-update.patch and it can be found in the queue-6.6 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. commit bd4a1d0a8445ff8bdc4d82484e540af8abbf0ea5 Author: Aleksandr Loktionov <aleksandr.loktionov@xxxxxxxxx> Date: Wed Jul 10 15:44:54 2024 -0700 i40e: fix: remove needless retries of NVM update [ Upstream commit 8b9b59e27aa88ba133fbac85def3f8be67f2d5a8 ] Remove wrong EIO to EGAIN conversion and pass all errors as is. After commit 230f3d53a547 ("i40e: remove i40e_status"), which should only replace F/W specific error codes with Linux kernel generic, all EIO errors suddenly started to be converted into EAGAIN which leads nvmupdate to retry until it timeouts and sometimes fails after more than 20 minutes in the middle of NVM update, so NVM becomes corrupted. The bug affects users only at the time when they try to update NVM, and only F/W versions that generate errors while nvmupdate. For example, X710DA2 with 0x8000ECB7 F/W is affected, but there are probably more... Command for reproduction is just NVM update: ./nvmupdate64 In the log instead of: i40e_nvmupd_exec_aq err I40E_ERR_ADMIN_QUEUE_ERROR aq_err I40E_AQ_RC_ENOMEM) appears: i40e_nvmupd_exec_aq err -EIO aq_err I40E_AQ_RC_ENOMEM i40e: eeprom check failed (-5), Tx/Rx traffic disabled The problematic code did silently convert EIO into EAGAIN which forced nvmupdate to ignore EAGAIN error and retry the same operation until timeout. That's why NVM update takes 20+ minutes to finish with the fail in the end. Fixes: 230f3d53a547 ("i40e: remove i40e_status") Co-developed-by: Kelvin Kang <kelvin.kang@xxxxxxxxx> Signed-off-by: Kelvin Kang <kelvin.kang@xxxxxxxxx> Reviewed-by: Arkadiusz Kubalewski <arkadiusz.kubalewski@xxxxxxxxx> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@xxxxxxxxx> Reviewed-by: Przemek Kitszel <przemyslaw.kitszel@xxxxxxxxx> Tested-by: Tony Brelinski <tony.brelinski@xxxxxxxxx> Signed-off-by: Tony Nguyen <anthony.l.nguyen@xxxxxxxxx> Reviewed-by: Jacob Keller <jacob.e.keller@xxxxxxxxx> Link: https://patch.msgid.link/20240710224455.188502-1-anthony.l.nguyen@xxxxxxxxx Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx> Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx> diff --git a/drivers/net/ethernet/intel/i40e/i40e_adminq.h b/drivers/net/ethernet/intel/i40e/i40e_adminq.h index 80125bea80a2a..290c23cec2fca 100644 --- a/drivers/net/ethernet/intel/i40e/i40e_adminq.h +++ b/drivers/net/ethernet/intel/i40e/i40e_adminq.h @@ -116,10 +116,6 @@ static inline int i40e_aq_rc_to_posix(int aq_ret, int aq_rc) -EFBIG, /* I40E_AQ_RC_EFBIG */ }; - /* aq_rc is invalid if AQ timed out */ - if (aq_ret == -EIO) - return -EAGAIN; - if (!((u32)aq_rc < (sizeof(aq_to_posix) / sizeof((aq_to_posix)[0])))) return -ERANGE;