Re: [PATCH 1/7] md: Revert fa2bbff7b0b4 ("md: synchronize flush io with array reconfiguration")

Hi,

On 2024/01/18 2:17, Mikulas Patocka wrote:
The commit fa2bbff7b0b4 breaks the LVM2 test shell/integrity-caching.sh,
so let's revert it.

sysrq: Show Blocked State
task:lvm             state:D stack:0     pid:8275  tgid:8275  ppid:1373   flags:0x00000002
Call Trace:
  <TASK>
  __schedule+0x228/0x570
  ? __percpu_ref_switch_mode+0xb7/0x1b0
  schedule+0x29/0xa0
  mddev_suspend+0xec/0x1a0 [md_mod]

We really need more information about the root cause here. If
mddev_suspend() is waiting for this flush IO to complete, then why
can't the flush IO finish?

Thanks,
Kuai

  ? housekeeping_test_cpu+0x30/0x30
  dm_table_postsuspend_targets+0x34/0x50 [dm_mod]
  __dm_destroy+0x1c5/0x1e0 [dm_mod]
  ? table_clear+0xa0/0xa0 [dm_mod]
  dev_remove+0xd4/0x110 [dm_mod]
  ctl_ioctl+0x2e1/0x570 [dm_mod]
  dm_ctl_ioctl+0x5/0x10 [dm_mod]
  __x64_sys_ioctl+0x85/0xa0
  do_syscall_64+0x5d/0x1a0
  entry_SYSCALL_64_after_hwframe+0x46/0x4e

Signed-off-by: Mikulas Patocka <mpatocka@xxxxxxxxxx>
Fixes: fa2bbff7b0b4 ("md: synchronize flush io with array reconfiguration")

---
  drivers/md/md.c |   21 ++++++---------------
  1 file changed, 6 insertions(+), 15 deletions(-)

Index: linux-2.6/drivers/md/md.c
===================================================================
--- linux-2.6.orig/drivers/md/md.c
+++ linux-2.6/drivers/md/md.c
@@ -543,9 +543,6 @@ static void md_end_flush(struct bio *bio
  	rdev_dec_pending(rdev, mddev);
  	if (atomic_dec_and_test(&mddev->flush_pending)) {
-		/* The pair is percpu_ref_get() from md_flush_request() */
-		percpu_ref_put(&mddev->active_io);
-
  		/* The pre-request flush has finished */
  		queue_work(md_wq, &mddev->flush_work);
  	}
@@ -565,7 +562,12 @@ static void submit_flushes(struct work_s
  	rdev_for_each_rcu(rdev, mddev)
  		if (rdev->raid_disk >= 0 &&
  		    !test_bit(Faulty, &rdev->flags)) {
+			/* Take two references, one is dropped
+			 * when request finishes, one after
+			 * we reclaim rcu_read_lock
+			 */
  			struct bio *bi;
+			atomic_inc(&rdev->nr_pending);
  			atomic_inc(&rdev->nr_pending);
  			rcu_read_unlock();
@@ -577,6 +579,7 @@ static void submit_flushes(struct work_s
  			atomic_inc(&mddev->flush_pending);
  			submit_bio(bi);
  			rcu_read_lock();
+			rdev_dec_pending(rdev, mddev);
  		}
  	rcu_read_unlock();
  	if (atomic_dec_and_test(&mddev->flush_pending))
@@ -629,18 +632,6 @@ bool md_flush_request(struct mddev *mdde
  	/* new request after previous flush is completed */
  	if (ktime_after(req_start, mddev->prev_flush_start)) {
  		WARN_ON(mddev->flush_bio);
-		/*
-		 * Grab a reference to make sure mddev_suspend() will wait for
-		 * this flush to be done.
-		 *
-		 * md_flush_reqeust() is called under md_handle_request() and
-		 * 'active_io' is already grabbed, hence percpu_ref_is_zero()
-		 * won't pass, percpu_ref_tryget_live() can't be used because
-		 * percpu_ref_kill() can be called by mddev_suspend()
-		 * concurrently.
-		 */
-		WARN_ON(percpu_ref_is_zero(&mddev->active_io));
-		percpu_ref_get(&mddev->active_io);
  		mddev->flush_bio = bio;
  		bio = NULL;
  	}
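
The comment restored in submit_flushes() describes taking two references on the rdev: one dropped when the request finishes, one dropped after the RCU read lock is re-taken. As a rough illustration only, the userspace sketch below models that counting with C11 atomics. It is not kernel code: struct model_rdev, model_end_flush() and model_submit_one() are hypothetical names invented for this example, and the RCU calls appear only as comments.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for struct md_rdev; only the counter is modeled. */
struct model_rdev {
	atomic_int nr_pending;
};

/* Models the completion side: drops the reference held for the request. */
static void model_end_flush(struct model_rdev *rdev)
{
	atomic_fetch_sub(&rdev->nr_pending, 1);
}

/*
 * Models one loop iteration on the submit side.
 * complete_early != 0 simulates the flush completing before the
 * read-side section is re-entered.
 * Returns the count observed while outside the read-side section.
 */
static int model_submit_one(struct model_rdev *rdev, int complete_early)
{
	int seen;

	atomic_fetch_add(&rdev->nr_pending, 1);	/* dropped at completion */
	atomic_fetch_add(&rdev->nr_pending, 1);	/* dropped after re-lock */

	/* rcu_read_unlock() would happen here; the flush bio is submitted */
	if (complete_early)
		model_end_flush(rdev);

	/* The second reference keeps the count above zero in this window */
	seen = atomic_load(&rdev->nr_pending);

	/* rcu_read_lock() would be re-taken here */
	atomic_fetch_sub(&rdev->nr_pending, 1);

	if (!complete_early)
		model_end_flush(rdev);

	return seen;
}
```

In this model the count seen outside the read-side section never reaches zero, even when the completion runs first, and it returns to zero once both references are dropped. Whether this pairing relates to the hang reported above is not established in this thread.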
