> On Aug 29, 2019, at 2:05 AM, Guoqing Jiang <guoqing.jiang@xxxxxxxxxxxxxxx> wrote: > > > > On 8/29/19 7:42 AM, Song Liu wrote: >>> drivers/md/raid5.c | 3 +-- >>> 1 file changed, 1 insertion(+), 2 deletions(-) >>> >>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c >>> index 88e56ee98976..e3dced8ad1b5 100644 >>> --- a/drivers/md/raid5.c >>> +++ b/drivers/md/raid5.c >>> @@ -4612,8 +4612,7 @@ static void break_stripe_batch_list(struct stripe_head *head_sh, >>> >>> list_del_init(&sh->batch_list); >>> >>> - WARN_ONCE(sh->state & ((1 << STRIPE_ACTIVE) | >>> - (1 << STRIPE_SYNCING) | >>> + WARN_ONCE(sh->state & ((1 << STRIPE_SYNCING) | >>> (1 << STRIPE_REPLACED) | >>> (1 << STRIPE_DELAYED) | >>> (1 << STRIPE_BIT_DELAY) | >> I read the code again, and now I am not sure whether we are fixing the issue. >> This WARN_ONCE() does not run for head_sh, which should have STRIPE_ACTIVE. >> It only runs on other stripes in the batch, which should not have STRIPE_ACTIVE. >> >> Does this make sense? > > Yes, maybe some stripes are added to batch_list with STRIPE_ACTIVE is set. > could it possible caused by the path? > > retry_aligned_read -> sh = raid5_get_active_stripe(conf, sector, 0, 1, 1) > ... > if (!add_stripe_bio(sh, raid_bio, dd_idx, 0, 0)) > ... > handle_stripe(sh) I think retry_aligned_read() will not add the sh to any batch list, because it should not be a full stripe write: if (stripe_can_batch(sh)) stripe_add_to_batch_list(conf, sh); But I am not sure. > > The sh is added to batch_list by add_stripe_bio -> stripe_add_to_batch_list, > then the state is set to STRIPE_ACTIVE at the beginning of handle_stripe. > > Maybe introduce a flag to show that the stripe is in batch_list, and don't call > handle_stripe for that stripe, some rough changes like. > > gjiang@ls00508:/media/gjiang/opensource-tree/md$ git diff > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > index e3dced8ad1b5..c04d5b55848a 100644 > --- a/drivers/md/raid5.c > +++ b/drivers/md/raid5.c > @@ -817,12 +817,14 @@ static void stripe_add_to_batch_list(struct r5conf *conf, struct stripe_head *sh > * can still add the stripe to batch list > */ > list_add(&sh->batch_list, &head->batch_list); > + set_bit(STRIPE_IN_BATCH_LIST, &sh->state); > spin_unlock(&head->batch_head->batch_lock); > } else { > head->batch_head = head; > sh->batch_head = head->batch_head; > spin_lock(&head->batch_lock); > list_add_tail(&sh->batch_list, &head->batch_list); > + set_bit(STRIPE_IN_BATCH_LIST, &sh->state); > spin_unlock(&head->batch_lock); > } > > @@ -6161,7 +6163,8 @@ static int retry_aligned_read(struct r5conf *conf, struct bio *raid_bio, > } > > set_bit(R5_ReadNoMerge, &sh->dev[dd_idx].flags); > - handle_stripe(sh); > + if (!test_bit(STRIPE_IN_BATCH_LIST, &sh->state)) > + handle_stripe(sh); > raid5_release_stripe(sh); > handled++; > } > diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h > index cf991f13403e..dee7a1dee505 100644 > --- a/drivers/md/raid5.h > +++ b/drivers/md/raid5.h > @@ -390,6 +390,7 @@ enum { > * in conf->r5c_full_stripe_list) > */ > STRIPE_R5C_PREFLUSH, /* need to flush journal device */ > + STRIPE_IN_BATCH_LIST, /* the stripe is already added to batch list */ I think we don't need a new flag. sh->batch_head should be sufficient. Thanks, Song