This is a note to let you know that I've just added the patch titled net: ena: fix rare uncompleted admin command false alarm to the 4.9-stable tree which can be found at: http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary The filename of the patch is: net-ena-fix-rare-uncompleted-admin-command-false-alarm.patch and it can be found in the queue-4.9 subdirectory. If you, or anyone else, feels it should not be added to the stable tree, please let <stable@xxxxxxxxxxxxxxx> know about it. >From foo@baz Mon Apr 9 17:09:24 CEST 2018 From: Netanel Belgazal <netanel@xxxxxxxxxx> Date: Sun, 11 Jun 2017 15:42:43 +0300 Subject: net: ena: fix rare uncompleted admin command false alarm From: Netanel Belgazal <netanel@xxxxxxxxxx> [ Upstream commit a77c1aafcc906f657d1a0890c1d898be9ee1d5c9 ] The current flow to detect admin completion is: while (command_not_completed) { if (timeout) error check_for_completion() sleep() } So in case the sleep took more than the timeout (in case the thread/workqueue was not scheduled due to higher priority task or prolonged VMexit), the driver can detect a stall even if the completion is present. The fix changes the order of this function to first check for completion and only after that check if the timeout expired. Fixes: 1738cd3ed342 ("Add a driver for Amazon Elastic Network Adapters (ENA)") Signed-off-by: Netanel Belgazal <netanel@xxxxxxxxxx> Signed-off-by: David S. Miller <davem@xxxxxxxxxxxxx> Signed-off-by: Sasha Levin <alexander.levin@xxxxxxxxxxxxx> Signed-off-by: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx> --- drivers/net/ethernet/amazon/ena/ena_com.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) --- a/drivers/net/ethernet/amazon/ena/ena_com.c +++ b/drivers/net/ethernet/amazon/ena/ena_com.c @@ -508,15 +508,20 @@ static int ena_com_comp_status_to_errno( static int ena_com_wait_and_process_admin_cq_polling(struct ena_comp_ctx *comp_ctx, struct ena_com_admin_queue *admin_queue) { - unsigned long flags; - u32 start_time; + unsigned long flags, timeout; int ret; - start_time = ((u32)jiffies_to_usecs(jiffies)); + timeout = jiffies + ADMIN_CMD_TIMEOUT_US; - while (comp_ctx->status == ENA_CMD_SUBMITTED) { - if ((((u32)jiffies_to_usecs(jiffies)) - start_time) > - ADMIN_CMD_TIMEOUT_US) { + while (1) { + spin_lock_irqsave(&admin_queue->q_lock, flags); + ena_com_handle_admin_completion(admin_queue); + spin_unlock_irqrestore(&admin_queue->q_lock, flags); + + if (comp_ctx->status != ENA_CMD_SUBMITTED) + break; + + if (time_is_before_jiffies(timeout)) { pr_err("Wait for completion (polling) timeout\n"); /* ENA didn't have any completion */ spin_lock_irqsave(&admin_queue->q_lock, flags); @@ -528,10 +533,6 @@ static int ena_com_wait_and_process_admi goto err; } - spin_lock_irqsave(&admin_queue->q_lock, flags); - ena_com_handle_admin_completion(admin_queue); - spin_unlock_irqrestore(&admin_queue->q_lock, flags); - msleep(100); } Patches currently in stable-queue which might be from netanel@xxxxxxxxxx are queue-4.9/net-ena-disable-admin-msix-while-working-in-polling-mode.patch queue-4.9/net-ena-fix-race-condition-between-submit-and-completion-admin-command.patch queue-4.9/net-ena-fix-rare-uncompleted-admin-command-false-alarm.patch queue-4.9/net-ena-add-missing-unmap-bars-on-device-removal.patch queue-4.9/net-ena-add-missing-return-when-ena_com_get_io_handlers-fails.patch