Re: raid5 reshape is stuck when raid5 journal device miss

On Fri, Aug 24, 2018 at 03:53:59AM -0400, Xiao Ni wrote:
> Hi all
> 
> The reshape can get stuck during a raid5 reshape when the raid5 journal device is
> missing. It can be reproduced 100% of the time.
> 
> The test steps are:
> 1. mdadm -CR /dev/md0 -l5 -n4 /dev/sd[b-e]1 --write-journal /dev/sdf1
> 2. mdadm --wait /dev/md0
> 3. mdadm /dev/md0 -f /dev/sdf1
> 4. mdadm /dev/md0 -r /dev/sdf1
> 5. mdadm /dev/md0 -a /dev/sdf1
> 6. mdadm -G -n5 /dev/md0
> 
> A reshape request has 4 steps:
> 1. read data for source stripes
> 2. write source stripe data to target stripes
> 3. calculate parity for target stripes
> 4. write target stripes to disks.
> 
> After step 3:
> sh->reconstruct_state is reconstruct_state_result
> sh->state is STRIPE_EXPANDING | STRIPE_EXPAND_READY
> 
> Now it needs to write the data to disks, so it needs to execute this part of the code:
> 
>         /* Finish reconstruct operations initiated by the expansion process */
>         if (sh->reconstruct_state == reconstruct_state_result) {
> 
> But because the journal disk has been removed, it executes this part of the code instead:
> 
>         if (s.failed > conf->max_degraded ||
>             (s.log_failed && s.injournal == 0)) {
>                 sh->check_state = 0;
>                 sh->reconstruct_state = 0;
> 
> After sh->reconstruct_state is set to zero, it goes back to calculate the parity again.
> Now it is stuck in an endless loop.
> 
> Can we allow the reshape to happen in this case? Is it OK just to return failure for the command
> `mdadm -G -n5 /dev/md0` in this case?
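
The loop described above can be modeled in a few lines of user-space C. This is only a toy sketch, not kernel code: the `toy_*` names, the two-value state enum, and the pass limit are assumptions made for illustration, and the only behavior taken from the report is that the failure check clears the reconstruct state before the "finish expansion" check can see it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's reconstruct_state values. */
enum toy_reconstruct_state {
	TOY_RECONSTRUCT_IDLE = 0,
	TOY_RECONSTRUCT_RESULT,	/* parity computed, stripe ready to be written */
};

struct toy_stripe {
	enum toy_reconstruct_state reconstruct_state;
	bool written;
};

/*
 * One pass of a toy stripe handler.
 * Returns true once the target stripe has been written.
 */
static bool toy_handle_stripe(struct toy_stripe *sh, bool log_failed, int injournal)
{
	/* Failure check: models the (s.log_failed && s.injournal == 0) branch. */
	if (log_failed && injournal == 0)
		sh->reconstruct_state = TOY_RECONSTRUCT_IDLE;

	/* Finish the expansion: only taken if the result state survived. */
	if (sh->reconstruct_state == TOY_RECONSTRUCT_RESULT) {
		sh->written = true;
		return true;
	}

	/* Otherwise parity is calculated again for the next pass. */
	sh->reconstruct_state = TOY_RECONSTRUCT_RESULT;
	return false;
}

/*
 * Run at most max_passes passes on a stripe that has just finished step 3.
 * Returns the pass number on which the stripe was written, or -1 if it
 * never was.
 */
static int toy_run(bool log_failed, int max_passes)
{
	struct toy_stripe sh = { TOY_RECONSTRUCT_RESULT, false };

	assert(max_passes > 0);
	for (int pass = 1; pass <= max_passes; pass++)
		if (toy_handle_stripe(&sh, log_failed, 0))
			return pass;
	return -1;
}
```

In this model `toy_run(false, 10)` returns 1, while `toy_run(true, 10)` returns -1: with the log marked failed, every pass recomputes parity and the write in step 4 is never reached.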

We don't actually support reshape with the log enabled yet. How about this patch:


diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index a001808a2b77..bfb811407061 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -46,6 +46,11 @@ extern int ppl_modify_log(struct r5conf *conf, struct md_rdev *rdev, bool add);
 extern void ppl_quiesce(struct r5conf *conf, int quiesce);
 extern int ppl_handle_flush_request(struct r5l_log *log, struct bio *bio);
 
+static inline bool raid5_has_log(struct r5conf *conf)
+{
+	return test_bit(MD_HAS_JOURNAL, &conf->mddev->flags);
+}
+
 static inline bool raid5_has_ppl(struct r5conf *conf)
 {
 	return test_bit(MD_HAS_PPL, &conf->mddev->flags);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 4ce0d7502fad..e4e98f47865d 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -733,7 +733,7 @@ static bool stripe_can_batch(struct stripe_head *sh)
 {
 	struct r5conf *conf = sh->raid_conf;
 
-	if (conf->log || raid5_has_ppl(conf))
+	if (raid5_has_log(conf) || raid5_has_ppl(conf))
 		return false;
 	return test_bit(STRIPE_BATCH_READY, &sh->state) &&
 		!test_bit(STRIPE_BITMAP_PENDING, &sh->state) &&
@@ -7737,7 +7737,7 @@ static int raid5_resize(struct mddev *mddev, sector_t sectors)
 	sector_t newsize;
 	struct r5conf *conf = mddev->private;
 
-	if (conf->log || raid5_has_ppl(conf))
+	if (raid5_has_log(conf) || raid5_has_ppl(conf))
 		return -EINVAL;
 	sectors &= ~((sector_t)conf->chunk_sectors - 1);
 	newsize = raid5_size(mddev, sectors, mddev->raid_disks);
@@ -7788,7 +7788,7 @@ static int check_reshape(struct mddev *mddev)
 {
 	struct r5conf *conf = mddev->private;
 
-	if (conf->log || raid5_has_ppl(conf))
+	if (raid5_has_log(conf) || raid5_has_ppl(conf))
 		return -EINVAL;
 	if (mddev->delta_disks == 0 &&
 	    mddev->new_layout == mddev->layout &&
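
One way to read the patch is that the check moves from a run-time pointer to a flag on the array. The sketch below is a user-space model of that difference, assuming `conf->log` can be NULL while `MD_HAS_JOURNAL` is still set (for example after the journal device has been failed and removed); that assumption is not stated in this thread. All `model_*` names and `MODEL_*` bits are hypothetical stand-ins for the kernel structures.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bits standing in for MD_HAS_JOURNAL and MD_HAS_PPL. */
#define MODEL_HAS_JOURNAL	(1UL << 0)
#define MODEL_HAS_PPL		(1UL << 1)

struct model_log { int unused; };

struct model_conf {
	unsigned long flags;	/* models conf->mddev->flags */
	struct model_log *log;	/* models conf->log */
};

/* Flag-based test, in the spirit of raid5_has_log() from the patch. */
static bool model_has_log(const struct model_conf *conf)
{
	assert(conf != NULL);
	return (conf->flags & MODEL_HAS_JOURNAL) != 0;
}

static bool model_has_ppl(const struct model_conf *conf)
{
	assert(conf != NULL);
	return (conf->flags & MODEL_HAS_PPL) != 0;
}

/* Old check: refuses the reshape only while a log object is attached. */
static int model_check_reshape_old(const struct model_conf *conf)
{
	if (conf->log || model_has_ppl(conf))
		return -EINVAL;
	return 0;
}

/* New check: refuses the reshape whenever the array is flagged as journaled. */
static int model_check_reshape_new(const struct model_conf *conf)
{
	if (model_has_log(conf) || model_has_ppl(conf))
		return -EINVAL;
	return 0;
}
```

For a configuration with `flags = MODEL_HAS_JOURNAL` and `log = NULL`, the old check returns 0 and the new check returns `-EINVAL`, which in this model is the case where `mdadm -G -n5 /dev/md0` would be rejected instead of starting a reshape.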


