On 2012-09-25 14:45 NeilBrown <neilb@xxxxxxx> Wrote: >On Thu, 20 Sep 2012 15:17:54 +0800 "Jianpeng Ma" <majianpeng@xxxxxxxxx> wrote: > >> On 2012-09-20 14:47 NeilBrown <neilb@xxxxxxx> Wrote: >> >On Thu, 20 Sep 2012 14:34:00 +0800 "Jianpeng Ma" <majianpeng@xxxxxxxxx> wrote: >> > >> >> In func add_stripe_bio: >> >> >> ..... >> >> >> bip = &sh->dev[dd_idx].toread; >> >> >> ...... >> >> >>spin_unlock_irq(&sh->stripe_lock); >> >> >> >> >> pr_debug("added bi b#%llu to stripe s#%llu, disk %d.\n", >> >> >> (unsigned long long)(*bip)->bi_sector, >> >> >> (unsigned long long)sh->sector, dd_idx); >> >> After spin_unlock_irq, this thread scheded and toread may become null. >> >> So it will be oops. >> >> >> >> Signed-off-by: Jianpeng Ma <majianpeng@xxxxxxxxx> >> >> --- >> >> drivers/md/raid5.c | 3 ++- >> >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> >> >> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c >> >> index adda94d..f172b1e 100644 >> >> --- a/drivers/md/raid5.c >> >> +++ b/drivers/md/raid5.c >> >> @@ -2356,6 +2356,7 @@ static int add_stripe_bio(struct stripe_head *sh, struct bio *bi, int dd_idx, in >> >> struct bio **bip; >> >> struct r5conf *conf = sh->raid_conf; >> >> int firstwrite=0; >> >> + sector_t sector = bi->bi_sector; >> >> >> >> pr_debug("adding bi b#%llu to stripe s#%llu\n", >> >> (unsigned long long)bi->bi_sector, >> >> @@ -2406,7 +2407,7 @@ static int add_stripe_bio(struct stripe_head *sh, struct bio *bi, int dd_idx, in >> >> spin_unlock_irq(&sh->stripe_lock); >> >> >> >> pr_debug("added bi b#%llu to stripe s#%llu, disk %d.\n", >> >> - (unsigned long long)(*bip)->bi_sector, >> >> + (unsigned long long)sector, >> >> (unsigned long long)sh->sector, dd_idx); >> >> >> >> if (conf->mddev->bitmap && firstwrite) { >> > >> > >> >how about we just move the spin_unlock_irq after the pr_debug?? >> > >> ah! Why are you think ? my method only add a parameter. > >Yes. > >> BTW, in func handle_failed_stripe: >> >>if (!test_bit(R5_Wantfill, &sh->dev[i].flags) && >> >> (!test_bit(R5_Insync, &sh->dev[i].flags) || >> >> test_bit(R5_ReadError, &sh->dev[i].flags))) { >> >> bi = sh->dev[i].toread; >> >> sh->dev[i].toread = NULL; >> >> if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags)) >> >> wake_up(&conf->wait_for_overlap); >> Why use stripe_lock to protect toread? > >I assume you mean that we should be holding the lock to protect toread, but >we aren't. >I've queued a patch to fix that. > Hi, Last Saturday, i sent a patch-set which contained a patch which fix this bug. You can check your mail! Thanks! >Thanks. >NeilBrown > > >> >> Thanks! > >?韬{.n?????%??檩??w?{.n???{炳盯w???塄}?财??j:+v??????2??璀??摺?囤??z夸z罐?+?????w棹f