Re: [PATCH] md/raid1: exit sync request if MD_RECOVERY_INTR is set

yuyufen <yuyufen@xxxxxxxxxx> · Mon, 9 Apr 2018 09:26:42 +0800

On 2018/4/2 7:09, Shaohua Li wrote:
On Mon, Mar 26, 2018 at 12:13:19PM +0800, Yufen Yu wrote:
We hit a sync thread that was stuck with the following stack:

  raid1_sync_request+0x2c9/0xb50
  md_do_sync+0x983/0xfa0
  md_thread+0x11c/0x160
  kthread+0x111/0x130
  ret_from_fork+0x35/0x40
  0xffffffffffffffff

At the same time, mdadm was trying to reap the sync thread:

  kthread_stop+0x42/0xf0
  md_unregister_thread+0x3a/0x70
  md_reap_sync_thread+0x15/0x160
  action_store+0x142/0x2a0
  md_attr_store+0x6c/0xb0
  kernfs_fop_write+0x102/0x180
  __vfs_write+0x33/0x170
  vfs_write+0xad/0x1a0
  SyS_write+0x52/0xc0
  do_syscall_64+0x6e/0x190
  entry_SYSCALL_64_after_hwframe+0x3d/0xa2

Commit 55ce74d4bfe1 introduced bio_end_io_list to hold bios, and these
bios are not ended while MD_CHANGE_PENDING is set. In that case the sync
thread waits for these bios to complete. But action_store() is holding
the mddev lock, so MD_CHANGE_PENDING cannot be cleared, and all of these
threads get stuck.

Fix this by checking MD_RECOVERY_INTR while waiting in raise_barrier()
so that the sync thread can exit while mdadm is stopping it.

Fixes: 55ce74d4bfe1 ("md/raid1: ensure device failure recorded before write request returns.")
Reviewed-by: Wei Fang <fangwei1@xxxxxxxxxx>
Reviewed-by: Miao Xie <miaoxie@xxxxxxxxxx>
Signed-off-by: Jason Yan <yanaijie@xxxxxxxxxx>
Signed-off-by: Yufen Yu <yuyufen@xxxxxxxxxx>
---
  drivers/md/raid1.c | 22 +++++++++++++++++-----
  1 file changed, 17 insertions(+), 5 deletions(-)

diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
index fe872dc6712e..af12d1e6cfc6 100644
--- a/drivers/md/raid1.c
+++ b/drivers/md/raid1.c
@@ -854,7 +854,7 @@ static void flush_pending_writes(struct r1conf *conf)
   *    there is no normal IO happeing.  It must arrange to call
   *    lower_barrier when the particular background IO completes.
   */
-static void raise_barrier(struct r1conf *conf, sector_t sector_nr)
+static int raise_barrier(struct r1conf *conf, sector_t sector_nr)
  {
  	int idx = sector_to_idx(sector_nr);
  
@@ -885,13 +885,22 @@ static void raise_barrier(struct r1conf *conf, sector_t sector_nr)
  	 *    max resync count which allowed on current I/O barrier bucket.
  	 */
  	wait_event_lock_irq(conf->wait_barrier,
-			    !conf->array_frozen &&
+			    (!conf->array_frozen &&
  			     !atomic_read(&conf->nr_pending[idx]) &&
-			     atomic_read(&conf->barrier[idx]) < RESYNC_DEPTH,
+			     atomic_read(&conf->barrier[idx]) < RESYNC_DEPTH) ||
+				test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery),
  			    conf->resync_lock);
  
+	if (test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery)) {
+		atomic_dec(&conf->barrier[idx]);
+		spin_unlock_irq(&conf->resync_lock);
It would be better to call wake_up(&conf->wait_barrier); here to be safe.

You are right. I will resend V2.

Thanks,
Yufen

+		return -EINTR;
+	}
+
  	atomic_inc(&conf->nr_sync_pending);
  	spin_unlock_irq(&conf->resync_lock);
+
+	return 0;
  }
  
  static void lower_barrier(struct r1conf *conf, sector_t sector_nr)
@@ -2662,9 +2671,12 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
  
  	bitmap_cond_end_sync(mddev->bitmap, sector_nr,
  		mddev_is_clustered(mddev) && (sector_nr + 2 * RESYNC_SECTORS > conf->cluster_sync_high));
-	r1_bio = raid1_alloc_init_r1buf(conf);
  
-	raise_barrier(conf, sector_nr);
+
+	if (raise_barrier(conf, sector_nr))
+		return 0;
+
+	r1_bio = raid1_alloc_init_r1buf(conf);
  
  	rcu_read_lock();
  	/*
--
2.13.6
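
For readers following the thread, the wait-with-interrupt pattern in the
patch can be sketched in userspace with pthreads. This is only an
illustrative model under stated assumptions, not the kernel code: struct
barrier_model, model_init(), model_raise_barrier() and
run_interrupted_scenario() are hypothetical names, a mutex and condition
variable stand in for conf->resync_lock and conf->wait_barrier, and a
plain bool stands in for MD_RECOVERY_INTR. It also includes the wake-up
on the interrupted path that was suggested in review.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the r1conf fields touched here. */
struct barrier_model {
	pthread_mutex_t lock;	/* plays the role of conf->resync_lock */
	pthread_cond_t wait;	/* plays the role of conf->wait_barrier */
	int nr_pending;		/* normal I/O that has not completed */
	int barrier;		/* raised barrier count */
	bool interrupted;	/* plays the role of MD_RECOVERY_INTR */
};

static void model_init(struct barrier_model *m, int nr_pending)
{
	pthread_mutex_init(&m->lock, NULL);
	pthread_cond_init(&m->wait, NULL);
	m->nr_pending = nr_pending;
	m->barrier = 0;
	m->interrupted = false;
}

/* Returns 0 once the barrier is raised, -EINTR if asked to stop. */
static int model_raise_barrier(struct barrier_model *m)
{
	pthread_mutex_lock(&m->lock);
	m->barrier++;	/* taken before waiting; the patch undoes it on -EINTR */

	/* Wait for pending I/O to drain OR for the interrupt flag. */
	while (m->nr_pending > 0 && !m->interrupted)
		pthread_cond_wait(&m->wait, &m->lock);

	if (m->interrupted) {
		m->barrier--;
		/* Wake other waiters, as suggested in the review comment. */
		pthread_cond_broadcast(&m->wait);
		pthread_mutex_unlock(&m->lock);
		return -EINTR;
	}

	pthread_mutex_unlock(&m->lock);
	return 0;
}

static void *sync_thread(void *arg)
{
	return (void *)(intptr_t)model_raise_barrier(arg);
}

/* Pending I/O never drains; a stop request must still release the waiter. */
static int run_interrupted_scenario(void)
{
	struct barrier_model m;
	pthread_t tid;
	void *ret;

	model_init(&m, 1);
	pthread_create(&tid, NULL, sync_thread, &m);

	pthread_mutex_lock(&m.lock);
	m.interrupted = true;		/* the "stop the sync thread" request */
	pthread_cond_broadcast(&m.wait);
	pthread_mutex_unlock(&m.lock);

	pthread_join(tid, &ret);
	assert(m.barrier == 0);		/* the increment was rolled back */
	return (int)(intptr_t)ret;
}
```

Without the interrupted check in the loop condition, the waiter in
run_interrupted_scenario() would block forever, which mirrors the hang
described in the commit message. The caller is expected to treat a
non-zero return as "give up on this request", as raid1_sync_request()
does in the patch.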

--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html