Re: [PATCH 2/2] raid5: update analysis state for failed stripe

On Wed, Sep 23, 2015 at 04:21:58PM +1000, Neil Brown wrote:
> Shaohua Li <shli@xxxxxx> writes:
> 
> > handle_failed_stripe() makes the stripe fail, i.e., all IO will return
> > with a failure, but it doesn't update stripe_head_state. Later
> > handle_stripe() has special handling for raid6 for handle_stripe_fill().
> > That check before handle_stripe_fill() doesn't skip the failed stripe
> > and we get a kernel crash in need_this_block().  This patch clears the
> > analysis state to make sure no functions are wrongly called after
> > handle_failed_stripe().
> >
> > Signed-off-by: Shaohua Li <shli@xxxxxx>
> > ---
> >  drivers/md/raid5.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> >
> > diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> > index 394cdf8..8e4fb89a 100644
> > --- a/drivers/md/raid5.c
> > +++ b/drivers/md/raid5.c
> > @@ -3155,6 +3155,8 @@ handle_failed_stripe(struct r5conf *conf, struct stripe_head *sh,
> >  			spin_unlock_irq(&sh->stripe_lock);
> >  			if (test_and_clear_bit(R5_Overlap, &sh->dev[i].flags))
> >  				wake_up(&conf->wait_for_overlap);
> > +			if (bi)
> > +				s->to_read--;
> >  			while (bi && bi->bi_iter.bi_sector <
> >  			       sh->dev[i].sector + STRIPE_SECTORS) {
> >  				struct bio *nextbi =
> > @@ -3173,6 +3175,8 @@ handle_failed_stripe(struct r5conf *conf, struct stripe_head *sh,
> >  		 */
> >  		clear_bit(R5_LOCKED, &sh->dev[i].flags);
> >  	}
> > +	s->to_write = 0;
> > +	s->written = 0;
> >  
> >  	if (test_and_clear_bit(STRIPE_FULL_WRITE, &sh->state))
> >  		if (atomic_dec_and_test(&conf->pending_full_writes))
> > -- 
> > 1.8.1
> 
> Again, this probably is a sensible fix, but I would like to be certain.
> Where exactly in need_this_block does the kernel crash?  I cannot see
> anything that could cause an invalid address....


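It crashes in this loop:

	for (i = 0; i < s->failed; i++) {
		if (fdev[i]->towrite &&

at the fdev[i]->towrite dereference, because s->failed is greater than 2
(it's 3 in my case), while the fdev array has only 2 entries, so the loop
reads past the end of the array.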
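
To illustrate the overrun outside the kernel, here is a minimal user-space
sketch. It is not what the patch does (the patch clears the analysis state
so this path is not reached); it only shows how a loop bounded by a failure
count can be clamped to the array size. The names dev_model, state_model,
MAX_TRACKED_FAILED and failed_with_pending_writes are hypothetical
stand-ins, not kernel identifiers.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed for this sketch: the fdev array tracks at most two devices. */
#define MAX_TRACKED_FAILED 2

/* Hypothetical stand-in for a stripe device; only the field used here. */
struct dev_model {
	void *towrite;	/* non-NULL when a write is queued on the device */
};

/* Hypothetical stand-in for the per-stripe analysis state. */
struct state_model {
	int failed;	/* number of failed devices; may exceed the array size */
};

/*
 * Count the tracked failed devices that have a pending write.
 * The loop bound is clamped to MAX_TRACKED_FAILED, so a failure count
 * of 3 never indexes past fdev[1].
 */
int failed_with_pending_writes(const struct state_model *s,
			       struct dev_model *fdev[MAX_TRACKED_FAILED])
{
	int limit, i, count = 0;

	assert(s != NULL && fdev != NULL);

	limit = s->failed < MAX_TRACKED_FAILED ? s->failed
					       : MAX_TRACKED_FAILED;
	for (i = 0; i < limit; i++)
		if (fdev[i] && fdev[i]->towrite)
			count++;

	return count;
}
```

With failed set to 3, an unclamped loop would read fdev[2]; the clamped
version above stops after fdev[1].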

Thanks,
Shaohua