Patch "workqueue: Fix spruious data race in __flush_work()" has been added to the 6.10-stable tree

This is a note to let you know that I've just added the patch titled

    workqueue: Fix spruious data race in __flush_work()

to the 6.10-stable tree which can be found at:
    http://www.kernel.org/git/?p=linux/kernel/git/stable/stable-queue.git;a=summary

The filename of the patch is:
     workqueue-fix-spruious-data-race-in-__flush_work.patch
and it can be found in the queue-6.10 subdirectory.

If you, or anyone else, feel it should not be added to the stable tree,
please let <stable@xxxxxxxxxxxxxxx> know about it.



commit aa9797bef1f118706e73b6719191b83e35e696e2
Author: Tejun Heo <tj@xxxxxxxxxx>
Date:   Mon Aug 5 09:37:25 2024 -1000

    workqueue: Fix spruious data race in __flush_work()
    
    [ Upstream commit 8bc35475ef1a23b0e224f3242eb11c76cab0ea88 ]
    
    When flushing a work item for cancellation, __flush_work() knows that it
    exclusively owns the work item through its PENDING bit. 134874e2eee9
    ("workqueue: Allow cancel_work_sync() and disable_work() from atomic
    contexts on BH work items") added a read of @work->data to determine whether
    to use busy wait for BH work items that are being canceled. While the read
    is safe when @from_cancel, @work->data was read before testing @from_cancel
    to simplify code structure:
    
            data = *work_data_bits(work);
            if (from_cancel &&
                !WARN_ON_ONCE(data & WORK_STRUCT_PWQ) && (data & WORK_OFFQ_BH)) {
    
    Although the value read was never used when !@from_cancel, the
    unconditional read could spuriously trigger KCSAN data race detection:
    
      ==================================================================
      BUG: KCSAN: data-race in __flush_work / __flush_work
    
      write to 0xffff8881223aa3e8 of 8 bytes by task 3998 on cpu 0:
       instrument_write include/linux/instrumented.h:41 [inline]
       ___set_bit include/asm-generic/bitops/instrumented-non-atomic.h:28 [inline]
       insert_wq_barrier kernel/workqueue.c:3790 [inline]
       start_flush_work kernel/workqueue.c:4142 [inline]
       __flush_work+0x30b/0x570 kernel/workqueue.c:4178
       flush_work kernel/workqueue.c:4229 [inline]
       ...
    
      read to 0xffff8881223aa3e8 of 8 bytes by task 50 on cpu 1:
       __flush_work+0x42a/0x570 kernel/workqueue.c:4188
       flush_work kernel/workqueue.c:4229 [inline]
       flush_delayed_work+0x66/0x70 kernel/workqueue.c:4251
       ...
    
      value changed: 0x0000000000400000 -> 0xffff88810006c00d
    
    Reorganize the code so that @from_cancel is tested before @work->data is
    accessed. The only problem is the spurious KCSAN report, so this shouldn't
    need READ_ONCE() or other access qualifiers.
    
    No functional changes.
    
    Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
    Reported-by: syzbot+b3e4f2f51ed645fd5df2@xxxxxxxxxxxxxxxxxxxxxxxxx
    Fixes: 134874e2eee9 ("workqueue: Allow cancel_work_sync() and disable_work() from atomic contexts on BH work items")
    Link: http://lkml.kernel.org/r/000000000000ae429e061eea2157@xxxxxxxxxx
    Cc: Jens Axboe <axboe@xxxxxxxxx>
    Signed-off-by: Sasha Levin <sashal@xxxxxxxxxx>

diff --git a/kernel/workqueue.c b/kernel/workqueue.c
index c8687c0ab3645..c970eec25c5a0 100644
--- a/kernel/workqueue.c
+++ b/kernel/workqueue.c
@@ -4190,7 +4190,6 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
 static bool __flush_work(struct work_struct *work, bool from_cancel)
 {
 	struct wq_barrier barr;
-	unsigned long data;
 
 	if (WARN_ON(!wq_online))
 		return false;
@@ -4208,29 +4207,35 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
 	 * was queued on a BH workqueue, we also know that it was running in the
 	 * BH context and thus can be busy-waited.
 	 */
-	data = *work_data_bits(work);
-	if (from_cancel &&
-	    !WARN_ON_ONCE(data & WORK_STRUCT_PWQ) && (data & WORK_OFFQ_BH)) {
-		/*
-		 * On RT, prevent a live lock when %current preempted soft
-		 * interrupt processing or prevents ksoftirqd from running by
-		 * keeping flipping BH. If the BH work item runs on a different
-		 * CPU then this has no effect other than doing the BH
-		 * disable/enable dance for nothing. This is copied from
-		 * kernel/softirq.c::tasklet_unlock_spin_wait().
-		 */
-		while (!try_wait_for_completion(&barr.done)) {
-			if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
-				local_bh_disable();
-				local_bh_enable();
-			} else {
-				cpu_relax();
+	if (from_cancel) {
+		unsigned long data = *work_data_bits(work);
+
+		if (!WARN_ON_ONCE(data & WORK_STRUCT_PWQ) &&
+		    (data & WORK_OFFQ_BH)) {
+			/*
+			 * On RT, prevent a live lock when %current preempted
+			 * soft interrupt processing or prevents ksoftirqd from
+			 * running by keeping flipping BH. If the BH work item
+			 * runs on a different CPU then this has no effect other
+			 * than doing the BH disable/enable dance for nothing.
+			 * This is copied from
+			 * kernel/softirq.c::tasklet_unlock_spin_wait().
+			 */
+			while (!try_wait_for_completion(&barr.done)) {
+				if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+					local_bh_disable();
+					local_bh_enable();
+				} else {
+					cpu_relax();
+				}
 			}
+			goto out_destroy;
 		}
-	} else {
-		wait_for_completion(&barr.done);
 	}
 
+	wait_for_completion(&barr.done);
+
+out_destroy:
 	destroy_work_on_stack(&barr.work);
 	return true;
 }
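
The reordering in the hunk above can be sketched outside the kernel as
follows. This is a minimal userspace illustration, not the kernel code: the
DEMO_* bit values, demo_read_data(), pick_wait_mode() and the demo_reads
counter are all hypothetical names invented for this example, and the counter
exists only to make the number of reads of the shared word observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel flag bits; values are illustrative. */
#define DEMO_STRUCT_PWQ (1UL << 2)
#define DEMO_OFFQ_BH    (1UL << 4)

enum wait_mode { WAIT_SLEEP, WAIT_BUSY };

/* Counts reads of the shared word so the access pattern can be checked. */
static unsigned long demo_reads;

static unsigned long demo_read_data(const unsigned long *word)
{
	demo_reads++;
	return *word;
}

/*
 * Test the flag first and read the shared word only when the caller is
 * known to own it (from_cancel). With from_cancel false, the word is
 * never touched, so there is no access for a race detector to flag.
 */
static enum wait_mode pick_wait_mode(const unsigned long *word,
				     bool from_cancel)
{
	assert(word != NULL);

	if (from_cancel) {
		unsigned long data = demo_read_data(word);

		if (!(data & DEMO_STRUCT_PWQ) && (data & DEMO_OFFQ_BH))
			return WAIT_BUSY;
	}

	/* Common fall-through path, mirroring the single sleeping wait. */
	return WAIT_SLEEP;
}
```

Calling pick_wait_mode() with from_cancel set to false leaves demo_reads
unchanged, which is the property the patch establishes for @work->data. The
result is the same as with the unconditional read; only the access pattern
differs.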