Patch "crypto: qat - resolve race condition during AER recovery" has been added to the 5.4-stable tree

This is a note to let you know that I've just added the patch titled

    crypto: qat - resolve race condition during AER recovery

to the 5.4-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     crypto-qat-resolve-race-condition-during-aer-recover.patch
and it can be found in the queue-5.4 subdirectory.

If you, or anyone else, feels it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit 5451aa18ecfed58331298d88f6a6944d090a9fa4
Author: Damian Muszynski <damian.muszynski@xxxxxxxxx>
Date:   Fri Feb 9 13:43:42 2024 +0100

    crypto: qat - resolve race condition during AER recovery
    
    [ Upstream commit 7d42e097607c4d246d99225bf2b195b6167a210c ]
    
    During the PCI AER system's error recovery process, the kernel driver
    may encounter a race condition when freeing the reset_data structure's
    memory. If the device restart takes more than 10 seconds, the function
    scheduling that restart exits due to a timeout and frees the reset_data
    structure. However, the worker still uses this data structure for
    completion notification after the restart is completed, which leads
    to a use-after-free (UAF) bug.
    
    This results in a KFENCE bug notice.
    
      BUG: KFENCE: use-after-free read in adf_device_reset_worker+0x38/0xa0 [intel_qat]
      Use-after-free read at 0x00000000bc56fddf (in kfence-#142):
      adf_device_reset_worker+0x38/0xa0 [intel_qat]
      process_one_work+0x173/0x340
    
    To resolve this race condition, the memory associated with the container
    of the work_struct is freed in the worker if the timeout expired,
    otherwise in the function that schedules the worker.
    The timeout can be detected by checking whether the caller is still
    waiting for completion, using the completion_done() function.
    
    Fixes: d8cba25d2c68 ("crypto: qat - Intel(R) QAT driver framework")
    Cc: <stable@xxxxxxxxxxxxxxx>
    Signed-off-by: Damian Muszynski <damian.muszynski@xxxxxxxxx>
    Reviewed-by: Giovanni Cabiddu <giovanni.cabiddu@xxxxxxxxx>
    Signed-off-by: Herbert Xu <herbert@xxxxxxxxxxxxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/drivers/crypto/qat/qat_common/adf_aer.c b/drivers/crypto/qat/qat_common/adf_aer.c
index 20f983b830065..a2989f0188cad 100644
--- a/drivers/crypto/qat/qat_common/adf_aer.c
+++ b/drivers/crypto/qat/qat_common/adf_aer.c
@@ -139,7 +139,8 @@ static void adf_device_reset_worker(struct work_struct *work)
 	if (adf_dev_init(accel_dev) || adf_dev_start(accel_dev)) {
 		/* The device hanged and we can't restart it so stop here */
 		dev_err(&GET_DEV(accel_dev), "Restart device failed\n");
-		if (reset_data->mode == ADF_DEV_RESET_ASYNC)
+		if (reset_data->mode == ADF_DEV_RESET_ASYNC ||
+		    completion_done(&reset_data->compl))
 			kfree(reset_data);
 		WARN(1, "QAT: device restart failed. Device is unusable\n");
 		return;
@@ -147,11 +148,19 @@ static void adf_device_reset_worker(struct work_struct *work)
 	adf_dev_restarted_notify(accel_dev);
 	clear_bit(ADF_STATUS_RESTARTING, &accel_dev->status);
 
-	/* The dev is back alive. Notify the caller if in sync mode */
-	if (reset_data->mode == ADF_DEV_RESET_SYNC)
-		complete(&reset_data->compl);
-	else
+	/*
+	 * The dev is back alive. Notify the caller if in sync mode
+	 *
+	 * If device restart will take a more time than expected,
+	 * the schedule_reset() function can timeout and exit. This can be
+	 * detected by calling the completion_done() function. In this case
+	 * the reset_data structure needs to be freed here.
+	 */
+	if (reset_data->mode == ADF_DEV_RESET_ASYNC ||
+	    completion_done(&reset_data->compl))
 		kfree(reset_data);
+	else
+		complete(&reset_data->compl);
 }
 
 static int adf_dev_aer_schedule_reset(struct adf_accel_dev *accel_dev,
@@ -184,8 +193,9 @@ static int adf_dev_aer_schedule_reset(struct adf_accel_dev *accel_dev,
 			dev_err(&GET_DEV(accel_dev),
 				"Reset device timeout expired\n");
 			ret = -EFAULT;
+		} else {
+			kfree(reset_data);
 		}
-		kfree(reset_data);
 		return ret;
 	}
 	return 0;
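
The ownership rule described in the commit message (whichever side is
still around last frees the structure) can be illustrated with a minimal
userspace sketch. This is not the driver's code: struct reset_ctx,
worker_finish() and waiter_leave() are hypothetical names, and the sketch
swaps in a C11 atomic compare-and-exchange on a state field rather than
relying on the kernel's completion_done().

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for the driver's reset_data. */
enum reset_state { RESET_PENDING, RESET_COMPLETED, RESET_WAITER_GONE };

struct reset_ctx {
	_Atomic int state;
};

struct reset_ctx *reset_ctx_alloc(void)
{
	struct reset_ctx *ctx = malloc(sizeof(*ctx));

	if (ctx)
		atomic_init(&ctx->state, RESET_PENDING);
	return ctx;
}

/*
 * Worker side: call once when the restart has finished.
 * Returns true if the worker freed ctx (the waiter had already left).
 */
bool worker_finish(struct reset_ctx *ctx)
{
	int expected = RESET_PENDING;

	if (atomic_compare_exchange_strong(&ctx->state, &expected,
					   RESET_COMPLETED))
		return false;	/* waiter still holds ctx and frees it later */

	free(ctx);		/* state was RESET_WAITER_GONE */
	return true;
}

/*
 * Waiter side: call once when the wait ends, whether it timed out or not.
 * Returns true if the waiter freed ctx (the worker had already finished).
 */
bool waiter_leave(struct reset_ctx *ctx)
{
	int expected = RESET_PENDING;

	if (atomic_compare_exchange_strong(&ctx->state, &expected,
					   RESET_WAITER_GONE))
		return false;	/* worker still running; it frees ctx later */

	free(ctx);		/* state was RESET_COMPLETED */
	return true;
}
```

Each side attempts a single transition out of RESET_PENDING. The side
whose exchange fails knows the other side has already finished with the
structure, so it is the only one that calls free(). In the timeout case
waiter_leave() runs first and returns false, and the later
worker_finish() frees the memory; in the normal case the roles are
reversed.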



