Re: [PATCH V2] md/raid1: exit sync request if MD_RECOVERY_INTR is set

On Mon, Apr 09, 2018 at 09:50:44AM +0800, Yufen Yu wrote:
> We hit a stuck sync thread with the following stack:
> 
>  raid1_sync_request+0x2c9/0xb50
>  md_do_sync+0x983/0xfa0
>  md_thread+0x11c/0x160
>  kthread+0x111/0x130
>  ret_from_fork+0x35/0x40
>  0xffffffffffffffff
> 
> At the same time, there is a stuck mdadm thread (mdadm --manage
> /dev/md2 --add /dev/sda). It is trying to stop the sync thread:
> 
>  kthread_stop+0x42/0xf0
>  md_unregister_thread+0x3a/0x70
>  md_reap_sync_thread+0x15/0x160
>  action_store+0x142/0x2a0
>  md_attr_store+0x6c/0xb0
>  kernfs_fop_write+0x102/0x180
>  __vfs_write+0x33/0x170
>  vfs_write+0xad/0x1a0
>  SyS_write+0x52/0xc0
>  do_syscall_64+0x6e/0x190
>  entry_SYSCALL_64_after_hwframe+0x3d/0xa2
> 
> Debug tools show that the sync thread is waiting in raise_barrier()
> until raid1d() ends all normal IO bios queued on bio_end_io_list
> (introduced in commit 55ce74d4bfe1). But raid1d() cannot end these bios
> while the MD_CHANGE_PENDING bit is set. It needs to take the
> mddev->reconfig_mutex lock and then clear the bit in md_check_recovery().
> However, the lock is held by mdadm in action_store().
> 
> Thus, there is a loop:
> mdadm is waiting for the sync thread to stop, the sync thread is
> waiting for raid1d() to end bios, and raid1d() is waiting for mdadm to
> release the mddev->reconfig_mutex lock before it can end bios.
> 
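The loop described above can be pictured as a wait-for graph with three nodes. The following is a minimal userspace sketch, not kernel code: the enum labels and both function names are hypothetical, and the `intr_breaks_wait` flag assumes that MD_RECOVERY_INTR is already set by the time mdadm blocks in kthread_stop(), which the commit message implies but this sketch does not verify.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the three parties named in the report. */
enum waiter { NOBODY = -1, MDADM, SYNC_THREAD, RAID1D, NR_WAITERS };

/*
 * Follow "waits for" edges from start and report a cycle if the walk
 * returns to start within NR_WAITERS hops.
 */
bool waits_in_cycle(const int waits_for[NR_WAITERS], int start)
{
	int cur = start;

	assert(start >= 0 && start < NR_WAITERS);
	for (int hop = 0; hop < NR_WAITERS; hop++) {
		cur = waits_for[cur];
		if (cur == NOBODY)
			return false;	/* someone can make progress */
		if (cur == start)
			return true;	/* back where we began: deadlock */
	}
	return false;
}

/* Build the graph from the report; intr_breaks_wait models the fix. */
bool report_deadlocks(bool intr_breaks_wait)
{
	int waits_for[NR_WAITERS];

	/* mdadm: kthread_stop() waits for the sync thread to exit. */
	waits_for[MDADM] = SYNC_THREAD;
	/* sync thread: raise_barrier() waits for raid1d() to end bios. */
	waits_for[SYNC_THREAD] = intr_breaks_wait ? NOBODY : RAID1D;
	/* raid1d(): needs reconfig_mutex, which mdadm is holding. */
	waits_for[RAID1D] = MDADM;

	return waits_in_cycle(waits_for, MDADM);
}
```

In this model, `report_deadlocks(false)` walks mdadm, sync thread, raid1d() and back to mdadm, while `report_deadlocks(true)` stops at the sync thread because its wait no longer depends on raid1d().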
> Fix this by checking MD_RECOVERY_INTR while waiting in raise_barrier(),
> so that the sync thread can exit while mdadm is stopping it.
> 
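The shape of the fix is "OR an interrupt flag into the wake-up condition, then undo the partial state change if the flag is what ended the wait". Below is a small single-threaded model of that logic in userspace C, offered as a sketch under stated assumptions rather than the kernel implementation: `struct bucket_model`, `wait_is_over()` and `finish_raise_barrier()` are hypothetical names, plain ints stand in for the kernel's atomics, locking and wake-ups are omitted, and the RESYNC_DEPTH value is only illustrative. It also assumes the barrier count was already raised before the wait began, which is what the atomic_dec() in the patch suggests.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#define RESYNC_DEPTH 32	/* illustrative value for this sketch */

/* Hypothetical stand-in for one barrier bucket plus the recovery flag. */
struct bucket_model {
	bool array_frozen;
	bool recovery_intr;	/* models MD_RECOVERY_INTR */
	int nr_pending;
	int barrier;		/* assumed already incremented before waiting */
	int nr_sync_pending;
};

/* The patched wake-up condition: normal criteria OR the interrupt flag. */
bool wait_is_over(const struct bucket_model *b)
{
	return (!b->array_frozen &&
		b->nr_pending == 0 &&
		b->barrier < RESYNC_DEPTH) ||
	       b->recovery_intr;
}

/* What happens once the wait ends: back out on interrupt, else proceed. */
int finish_raise_barrier(struct bucket_model *b)
{
	assert(wait_is_over(b));

	if (b->recovery_intr) {
		assert(b->barrier > 0);
		b->barrier--;	/* undo the barrier raised before the wait */
		return -EINTR;
	}

	b->nr_sync_pending++;
	return 0;
}
```

With normal IO still pending, `wait_is_over()` stays false until `recovery_intr` is set. After that, `finish_raise_barrier()` returns -EINTR and leaves `nr_sync_pending` untouched, which corresponds to raid1_sync_request() returning early before allocating an r1_bio.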
> Fixes: 55ce74d4bfe1 ("md/raid1: ensure device failure recorded before write request returns.")
> Signed-off-by: Jason Yan <yanaijie@xxxxxxxxxx>
> Signed-off-by: Yufen Yu <yuyufen@xxxxxxxxxx>
Applied, thanks!

> ---
>  drivers/md/raid1.c | 23 ++++++++++++++++++-----
>  1 file changed, 18 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c
> index fe872dc6712e..df486bd7ddf5 100644
> --- a/drivers/md/raid1.c
> +++ b/drivers/md/raid1.c
> @@ -854,7 +854,7 @@ static void flush_pending_writes(struct r1conf *conf)
>   *    there is no normal IO happeing.  It must arrange to call
>   *    lower_barrier when the particular background IO completes.
>   */
> -static void raise_barrier(struct r1conf *conf, sector_t sector_nr)
> +static int raise_barrier(struct r1conf *conf, sector_t sector_nr)
>  {
>  	int idx = sector_to_idx(sector_nr);
>  
> @@ -885,13 +885,23 @@ static void raise_barrier(struct r1conf *conf, sector_t sector_nr)
>  	 *    max resync count which allowed on current I/O barrier bucket.
>  	 */
>  	wait_event_lock_irq(conf->wait_barrier,
> -			    !conf->array_frozen &&
> +			    (!conf->array_frozen &&
>  			     !atomic_read(&conf->nr_pending[idx]) &&
> -			     atomic_read(&conf->barrier[idx]) < RESYNC_DEPTH,
> +			     atomic_read(&conf->barrier[idx]) < RESYNC_DEPTH) ||
> +				test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery),
>  			    conf->resync_lock);
>  
> +	if (test_bit(MD_RECOVERY_INTR, &conf->mddev->recovery)) {
> +		atomic_dec(&conf->barrier[idx]);
> +		spin_unlock_irq(&conf->resync_lock);
> +		wake_up(&conf->wait_barrier);
> +		return -EINTR;
> +	}
> +
>  	atomic_inc(&conf->nr_sync_pending);
>  	spin_unlock_irq(&conf->resync_lock);
> +
> +	return 0;
>  }
>  
>  static void lower_barrier(struct r1conf *conf, sector_t sector_nr)
> @@ -2662,9 +2672,12 @@ static sector_t raid1_sync_request(struct mddev *mddev, sector_t sector_nr,
>  
>  	bitmap_cond_end_sync(mddev->bitmap, sector_nr,
>  		mddev_is_clustered(mddev) && (sector_nr + 2 * RESYNC_SECTORS > conf->cluster_sync_high));
> -	r1_bio = raid1_alloc_init_r1buf(conf);
>  
> -	raise_barrier(conf, sector_nr);
> +
> +	if (raise_barrier(conf, sector_nr))
> +		return 0;
> +
> +	r1_bio = raid1_alloc_init_r1buf(conf);
>  
>  	rcu_read_lock();
>  	/*
> -- 
> 2.13.6
> 