On 8/29/19 7:42 AM, Song Liu wrote:
drivers/md/raid5.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index 88e56ee98976..e3dced8ad1b5 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -4612,8 +4612,7 @@ static void break_stripe_batch_list(struct stripe_head *head_sh,
list_del_init(&sh->batch_list);
- WARN_ONCE(sh->state & ((1 << STRIPE_ACTIVE) |
- (1 << STRIPE_SYNCING) |
+ WARN_ONCE(sh->state & ((1 << STRIPE_SYNCING) |
(1 << STRIPE_REPLACED) |
(1 << STRIPE_DELAYED) |
(1 << STRIPE_BIT_DELAY) |
I read the code again, and now I am not sure whether we are fixing the issue.
This WARN_ONCE() does not run for head_sh, which should have STRIPE_ACTIVE.
It only runs on other stripes in the batch, which should not have STRIPE_ACTIVE.
Does this make sense?
Yes, maybe some stripes are added to batch_list with STRIPE_ACTIVE is set.
could it possible caused by the path?
retry_aligned_read -> sh = raid5_get_active_stripe(conf, sector, 0, 1, 1)
...
if (!add_stripe_bio(sh, raid_bio,
dd_idx, 0, 0))
...
handle_stripe(sh)
The sh is added to batch_list by add_stripe_bio -> stripe_add_to_batch_list,
then the state is set to STRIPE_ACTIVE at the beginning of handle_stripe.
Maybe introduce a flag to show that the stripe is in batch_list, and
don't call
handle_stripe for that stripe, some rough changes like.
gjiang@ls00508:/media/gjiang/opensource-tree/md$ git diff
diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
index e3dced8ad1b5..c04d5b55848a 100644
--- a/drivers/md/raid5.c
+++ b/drivers/md/raid5.c
@@ -817,12 +817,14 @@ static void stripe_add_to_batch_list(struct r5conf
*conf, struct stripe_head *sh
* can still add the stripe to batch list
*/
list_add(&sh->batch_list, &head->batch_list);
+ set_bit(STRIPE_IN_BATCH_LIST, &sh->state);
spin_unlock(&head->batch_head->batch_lock);
} else {
head->batch_head = head;
sh->batch_head = head->batch_head;
spin_lock(&head->batch_lock);
list_add_tail(&sh->batch_list, &head->batch_list);
+ set_bit(STRIPE_IN_BATCH_LIST, &sh->state);
spin_unlock(&head->batch_lock);
}
@@ -6161,7 +6163,8 @@ static int retry_aligned_read(struct r5conf
*conf, struct bio *raid_bio,
}
set_bit(R5_ReadNoMerge, &sh->dev[dd_idx].flags);
- handle_stripe(sh);
+ if (!test_bit(STRIPE_IN_BATCH_LIST, &sh->state))
+ handle_stripe(sh);
raid5_release_stripe(sh);
handled++;
}
diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index cf991f13403e..dee7a1dee505 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -390,6 +390,7 @@ enum {
* in conf->r5c_full_stripe_list)
*/
STRIPE_R5C_PREFLUSH, /* need to flush journal device */
+ STRIPE_IN_BATCH_LIST, /* the stripe is already added to batch
list */
};
#define STRIPE_EXPAND_SYNC_FLAGS \
What do you think about it? Thanks.
Regards,
Guoqing