raid5 reshape is stuck when the raid5 journal device is missing

Xiao Ni <xni@xxxxxxxxxx> · Fri, 24 Aug 2018 03:53:59 -0400 (EDT)

Hi all

A raid5 reshape gets stuck when the raid5 journal device is missing. It can
be reproduced 100% of the time.

The test steps are:
1. mdadm -CR /dev/md0 -l5 -n4 /dev/sd[b-e]1 --write-journal /dev/sdf1
2. mdadm --wait /dev/md0
3. mdadm /dev/md0 -f /dev/sdf1
4. mdadm /dev/md0 -r /dev/sdf1
5. mdadm /dev/md0 -a /dev/sdf1
6. mdadm -G -n5 /dev/md0

A reshape request has 4 steps:
1. read data for source stripes
2. write source stripe data to target stripes
3. calculate parity for target stripes
4. write target stripes to disks

After step 3:
sh->reconstruct_state is reconstruct_state_result
sh->state is STRIPE_EXPANDING | STRIPE_EXPAND_READY

Now it needs to write the data to the disks, which requires executing this part of the code:

        /* Finish reconstruct operations initiated by the expansion process */
        if (sh->reconstruct_state == reconstruct_state_result) {

But because the journal disk has been removed, it executes this part of the code:

        if (s.failed > conf->max_degraded ||
            (s.log_failed && s.injournal == 0)) {
                sh->check_state = 0;
                sh->reconstruct_state = 0;

After sh->reconstruct_state is set to zero, the stripe goes back to calculating the
parity again, so it is stuck in an endless loop.
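
To make the loop concrete, here is a toy user-space model in C of the two
branches quoted above. It is only a sketch, not kernel code: the model_* names,
the two-value state enum and the pass limit are made up for illustration, the
s.failed > conf->max_degraded condition is left out, and it assumes the failure
branch runs before the "Finish reconstruct operations" branch on every pass.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-ins for the kernel's reconstruct states (not the real enum). */
enum model_reconstruct_state {
    MODEL_IDLE = 0,   /* nothing computed; parity must be (re)calculated */
    MODEL_RESULT = 1  /* parity computed; stripe is ready to be written */
};

/* Hypothetical, heavily reduced view of a stripe during reshape. */
struct model_stripe {
    enum model_reconstruct_state reconstruct_state;
    bool log_failed;  /* journal device is missing or faulty */
    int injournal;    /* blocks of this stripe held in the journal */
};

/*
 * One pass of the toy handler. Returns true when the stripe reaches the
 * write-out step, false when it only (re)calculated parity.
 */
static bool model_handle_stripe(struct model_stripe *sh)
{
    /* Failure branch (assumed to run first): discards the computed parity. */
    if (sh->log_failed && sh->injournal == 0)
        sh->reconstruct_state = MODEL_IDLE;

    /* "Finish reconstruct operations" branch: step 4, write to disks. */
    if (sh->reconstruct_state == MODEL_RESULT)
        return true;

    /* Step 3 again: parity is recalculated and the result state is set. */
    sh->reconstruct_state = MODEL_RESULT;
    return false;
}

/*
 * Starts from the state after step 3 and runs at most max_passes passes.
 * Returns the 1-based pass on which the stripe was written, or -1 if the
 * write-out step was never reached.
 */
static int model_passes_until_written(bool log_failed, int max_passes)
{
    struct model_stripe sh = { MODEL_RESULT, log_failed, 0 };

    assert(max_passes > 0);
    for (int pass = 1; pass <= max_passes; pass++) {
        if (model_handle_stripe(&sh))
            return pass;
    }
    return -1;
}
```

Under these assumptions, model_passes_until_written(false, 1000) returns 1 (the
stripe is written on the first pass), while model_passes_until_written(true, 1000)
returns -1: every pass resets the state, recalculates the parity, and never gets
to the write.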

Can we allow the reshape to happen in this case? Is it OK to just return failure for the
command `mdadm -G -n5 /dev/md0` in this case?

Best Regards
Xiao