On Fri, Sep 07, 2018 at 11:30:34AM +0800, alexwu wrote: > From: Alex Wu <alexwu@xxxxxxxxxxxx> > > In raid10_sync_request(), we expect issue a bio with callback > end_sync_read(), and a bio with callback end_sync_write(). > > In normal case, we will add resync sectors into mddev->recovery_active > when raid10_sync_request() returned, and sub resync sectors from > mddev->recovery_active when end_sync_write() call end_sync_request(). > > If new added disk, which are replacing the old disk, is set faulty, > there is a race issue: > 1. In the first rcu protected section, resync thread did not detect > that mreplace is set faulty and pass the condition. > 2. In the second rcu protected section, mreplace is set faulty. > 3. But, resync thread will prepare the read object first, and then > check the write condition. > 4. It will find that mreplace is set faulty and do not have to > prepare write object. > This cause we add resync sectors but never sub. > > This issue can be easily reproduced by the following steps: > mdadm -C /dev/md0 --assume-clean -l 10 -n 4 /dev/sd[abcd] > mdadm /dev/md0 -a /dev/sde > mdadm /dev/md0 --replace /dev/sdd > sleep 1 > mdadm /dev/md0 -f /dev/sde > > This issue can be fixed by checking the write condition before prepare > the read object in the second rcu protected section. > > Reported-by: Alex Chen <alexchen@xxxxxxxxxxxx> > Reviewed-by: Allen Peng <allenpeng@xxxxxxxxxxxx> > Reviewed-by: BingJing Chang <bingjingc@xxxxxxxxxxxx> > Signed-off-by: Alex Wu <alexwu@xxxxxxxxxxxx> > --- > drivers/md/raid10.c | 11 +++++++++++ > 1 file changed, 11 insertions(+) > > diff --git a/drivers/md/raid10.c b/drivers/md/raid10.c > index 9818980..593d193 100644 > --- a/drivers/md/raid10.c > +++ b/drivers/md/raid10.c > @@ -3189,6 +3189,17 @@ static sector_t raid10_sync_request(struct mddev *mddev, sector_t sector_nr, > continue; > } > } > + > + /* make sure we need to write > + * before we prepare read object > + */ > + if (test_bit(In_sync, &mrdev->flags) && > + (mreplace == NULL || > + r10_bio->devs[1].repl_bio == NULL || > + test_bit(Faulty, &mreplace->flags))) { > + break; > + } > + Does this fix the problem completely? Faulty can be set at any time, for example, after this check, then we will have a mismatch again. How about we record the Faulty bit for mpreplace locally, then use it for both read/write check. Thanks, Shaohua > bio = r10_bio->devs[0].bio; > bio->bi_next = biolist; > biolist = bio; > -- > 2.7.4 >