On 08/30/2018 02:02 AM, Shaohua Li wrote:
On Fri, Aug 24, 2018 at 03:53:59AM -0400, Xiao Ni wrote:
Hi all
A raid5 reshape can get stuck when the raid5 journal device is missing. It
reproduces 100% of the time.
The test steps are:
1. mdadm -CR /dev/md0 -l5 -n4 /dev/sd[b-e]1 --write-journal /dev/sdf1
2. mdadm --wait /dev/md0
3. mdadm /dev/md0 -f /dev/sdf1
4. mdadm /dev/md0 -r /dev/sdf1
5. mdadm /dev/md0 -a /dev/sdf1
6. mdadm -G -n5 /dev/md0
A reshape request has 4 steps:
1. read data for the source stripes
2. write the source stripes' data to the target stripes
3. calculate parity for the target stripes
4. write the target stripes to disk
After step 3:
sh->reconstruct_state is reconstruct_state_result
sh->state is STRIPE_EXPANDING | STRIPE_EXPAND_READY
Now it needs to write the data to disk, which means executing this part of the code:
/* Finish reconstruct operations initiated by the expansion process */
if (sh->reconstruct_state == reconstruct_state_result) {
But because the journal disk has been removed, it executes this part of the code:
if (s.failed > conf->max_degraded ||
    (s.log_failed && s.injournal == 0)) {
	sh->check_state = 0;
	sh->reconstruct_state = 0;
After sh->reconstruct_state is set to zero, the stripe goes back to calculating the parity again.
Now it's stuck in an endless loop.
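The loop described above can be illustrated with a toy model. This is only a sketch, not kernel code: the toy_* names are hypothetical, the state machine is reduced to two states, and it assumes the failure check runs before the write-out check on every pass, as the description suggests.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins for the kernel's reconstruct states; not the real enum. */
enum toy_reconstruct_state {
	TOY_IDLE = 0,	/* parity not computed (the "zero" state) */
	TOY_RESULT,	/* parity computed, stripe ready to be written */
};

struct toy_stripe {
	enum toy_reconstruct_state reconstruct_state;
	bool written;	/* set once step 4 (write to disk) has happened */
};

/* One pass of a heavily simplified stripe handler. */
static void toy_handle_stripe(struct toy_stripe *sh, bool log_failed,
			      int injournal)
{
	/* Failure check: a missing journal throws the parity result away. */
	if (log_failed && injournal == 0)
		sh->reconstruct_state = TOY_IDLE;

	/* Write-out check: only reached with a valid parity result. */
	if (sh->reconstruct_state == TOY_RESULT) {
		sh->written = true;
		return;
	}

	/* Expanding stripe without parity: calculate it again (step 3). */
	sh->reconstruct_state = TOY_RESULT;
}

/*
 * Start from the state after step 3 and run up to max_passes passes.
 * Returns the pass on which the stripe was written, or -1 if it never was.
 */
static int toy_run(bool log_failed, int max_passes)
{
	struct toy_stripe sh = { TOY_RESULT, false };

	assert(max_passes > 0);
	for (int pass = 1; pass <= max_passes; pass++) {
		toy_handle_stripe(&sh, log_failed, 0);
		if (sh.written)
			return pass;
	}
	return -1;
}
```

With a healthy journal, toy_run(false, 1000) writes the stripe on the first pass. With the journal missing, toy_run(true, 1000) returns -1: every pass resets the state and recomputes parity, so the write is never reached.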
Can we allow the reshape to happen in this case? Is it OK to just return failure for the command
`mdadm -G -n5 /dev/md0` in this case?
We actually don't support reshape with log enabled yet. How about this one:
diff --git a/drivers/md/raid5-log.h b/drivers/md/raid5-log.h
index a001808a2b77..bfb811407061 100644
--- a/drivers/md/raid5-log.h
+++ b/drivers/md/raid5-log.h
@@ -46,6 +46,11 @@ extern int ppl_modify_log(struct r5conf *conf, struct md_rdev *rdev, bool add);
extern void ppl_quiesce(struct r5conf *conf, int quiesce);
extern int ppl_handle_flush_request(struct r5l_log *log, struct bio *bio);
+static inline bool raid5_has_log(struct r5conf *conf)
+{
+ return test_bit(MD_HAS_JOURNAL, &conf->mddev->flags);
+}
+
static inline bool raid5_has_ppl(struct r5conf *conf)
{
return test_bit(MD_HAS_PPL, &conf->mddev->flags);
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 4ce0d7502fad..e4e98f47865d 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -733,7 +733,7 @@ static bool stripe_can_batch(struct stripe_head *sh)
{
struct r5conf *conf = sh->raid_conf;
- if (conf->log || raid5_has_ppl(conf))
+ if (raid5_has_log(conf) || raid5_has_ppl(conf))
return false;
return test_bit(STRIPE_BATCH_READY, &sh->state) &&
!test_bit(STRIPE_BITMAP_PENDING, &sh->state) &&
@@ -7737,7 +7737,7 @@ static int raid5_resize(struct mddev *mddev, sector_t sectors)
sector_t newsize;
struct r5conf *conf = mddev->private;
- if (conf->log || raid5_has_ppl(conf))
+ if (raid5_has_log(conf) || raid5_has_ppl(conf))
return -EINVAL;
sectors &= ~((sector_t)conf->chunk_sectors - 1);
newsize = raid5_size(mddev, sectors, mddev->raid_disks);
@@ -7788,7 +7788,7 @@ static int check_reshape(struct mddev *mddev)
{
struct r5conf *conf = mddev->private;
- if (conf->log || raid5_has_ppl(conf))
+ if (raid5_has_log(conf) || raid5_has_ppl(conf))
return -EINVAL;
if (mddev->delta_disks == 0 &&
mddev->new_layout == mddev->layout &&
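The patch replaces a test of the conf->log pointer with a test of the MD_HAS_JOURNAL flag. The sketch below shows why the two can disagree. It is a toy model with hypothetical toy_* names, and it assumes (this is not stated in the thread) that the log pointer is cleared when the journal device is removed while the flag stays set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical flag bit standing in for MD_HAS_JOURNAL. */
#define TOY_HAS_JOURNAL (1UL << 0)

struct toy_log {
	int unused;
};

struct toy_conf {
	struct toy_log *log;	/* live journal object; NULL once the device is gone (assumed) */
	unsigned long flags;	/* array configuration; survives device removal (assumed) */
};

/* Old-style check: looks only at the live pointer. */
static bool toy_reshape_blocked_old(const struct toy_conf *conf)
{
	return conf->log != NULL;
}

/* New-style check: asks whether the array is configured with a journal. */
static bool toy_reshape_blocked_new(const struct toy_conf *conf)
{
	return (conf->flags & TOY_HAS_JOURNAL) != 0;
}

/* Model journal removal: the pointer is cleared, the flag is kept. */
static void toy_remove_journal(struct toy_conf *conf)
{
	assert(conf != NULL);
	conf->log = NULL;
}
```

In this model both checks block the reshape while the journal is present. After toy_remove_journal(), only the flag-based check still blocks it, which corresponds to the reshape being rejected with -EINVAL instead of starting and getting stuck.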
Hi Shaohua
The patch fixes this problem. Thanks for your time.
Best Regards
Xiao