Re: Re: [RFC PATCH] raid5: Add R5_ReadNoMerge flag which prevent bio from merging at block layer

majianpeng <majianpeng@xxxxxxxxx> · Tue, 3 Jul 2012 16:52:49 +0800

On 2012-07-03 14:38 NeilBrown <neilb@xxxxxxx> Wrote:
>On Tue, 3 Jul 2012 14:23:41 +0800 majianpeng <majianpeng@xxxxxxxxx> wrote:
>
>> Because bios will merge at block-layer,so bios-error may caused by other
>> bio which be merged into to the same request.
>> Using this flag,it will find exactly error-sector and not do redundant
>> operation like re-write and re-read.
>> 
>> Signed-off-by: majianpeng <majianpeng@xxxxxxxxx>
>
>Hi,
> I think this patch needs a more detailed explanation.
>
>What exactly is the situation that causes a problem, and what exactly is the
>problem that it causes?
>
>Pretend that I don't know anything about what happens below the md level..
>
>Thanks,
>NeilBrown
>
>
>> ---
>>  block/blk-core.c   |    8 ++++++++
>>  drivers/md/raid5.c |   16 +++++++++++++---
>>  drivers/md/raid5.h |    1 +
>>  3 files changed, 22 insertions(+), 3 deletions(-)
>> 
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 3c923a7..ee04bfc 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1401,6 +1401,9 @@ void init_request_from_bio(struct request *req, struct bio *bio)
>>  	if (bio->bi_rw & REQ_RAHEAD)
>>  		req->cmd_flags |= REQ_FAILFAST_MASK;
>>  
>> +	if (unlikely(bio->bi_rw & REQ_NOMERGE))
>> +		req->cmd_flags |= REQ_NOMERGE;
>> +
>>  	req->errors = 0;
>>  	req->__sector = bio->bi_sector;
>>  	req->ioprio = bio_prio(bio);
>> @@ -1428,6 +1431,11 @@ void blk_queue_bio(struct request_queue *q, struct bio *bio)
>>  		goto get_rq;
>>  	}
>>  
>> +	if (unlikely(bio->bi_rw & REQ_NOMERGE)) {
>> +		spin_lock_irq(q->queue_lock);
>> +		where = ELEVATOR_INSERT_BACK;
>> +		goto get_rq;
>> +	}
>>  	/*
>>  	 * Check if we can merge with the plugged list before grabbing
>>  	 * any locks.
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index d267672..04f78d2 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -632,6 +632,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>>  			else
>>  				bi->bi_sector = (sh->sector
>>  						 + rdev->data_offset);
>> +			if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
>> +				bi->bi_rw |= REQ_NOMERGE;
>> +
>>  			bi->bi_flags = 1 << BIO_UPTODATE;
>>  			bi->bi_idx = 0;
>>  			bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
>> @@ -1731,7 +1734,9 @@ static void raid5_end_read_request(struct bio * bi, int error)
>>  			atomic_add(STRIPE_SECTORS, &rdev->corrected_errors);
>>  			clear_bit(R5_ReadError, &sh->dev[i].flags);
>>  			clear_bit(R5_ReWrite, &sh->dev[i].flags);
>> -		}
>> +		} else if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
>> +			clear_bit(R5_ReadNoMerge, &sh->dev[i].flags);
>> +
>>  		if (atomic_read(&rdev->read_errors))
>>  			atomic_set(&rdev->read_errors, 0);
>>  	} else {
>> @@ -1773,7 +1778,11 @@ static void raid5_end_read_request(struct bio * bi, int error)
>>  		else
>>  			retry = 1;
>>  		if (retry)
>> -			set_bit(R5_ReadError, &sh->dev[i].flags);
>> +			if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags)) {
>> +				set_bit(R5_ReadError, &sh->dev[i].flags);
>> +				clear_bit(R5_ReadNoMerge, &sh->dev[i].flags);
>> +			} else
>> +				set_bit(R5_ReadNoMerge, &sh->dev[i].flags);
>>  		else {
>>  			clear_bit(R5_ReadError, &sh->dev[i].flags);
>>  			clear_bit(R5_ReWrite, &sh->dev[i].flags);
>> @@ -4481,7 +4490,8 @@ static int  retry_aligned_read(struct r5conf *conf, struct bio *raid_bio)
>>  			conf->retry_read_aligned = raid_bio;
>>  			return handled;
>>  		}
>> -
>> +		if (likely(raid_bio->bi_size >> 9) > STRIPE_SECTORS)
>> +			set_bit(R5_ReadNoMerge, &sh->dev[dd_idx].flags);
>>  		handle_stripe(sh);
>>  		release_stripe(sh);
>>  		handled++;
>> diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
>> index 2164021..6767d07 100644
>> --- a/drivers/md/raid5.h
>> +++ b/drivers/md/raid5.h
>> @@ -273,6 +273,7 @@ enum r5dev_flags {
>>  	R5_Wantwrite,
>>  	R5_Overlap,	/* There is a pending overlapping request
>>  			 * on this block */
>> +	R5_ReadNoMerge, /* prevent bio from merging in block-layer */
>>  	R5_ReadError,	/* seen a read error here recently */
>>  	R5_ReWrite,	/* have tried to over-write the readerror */
>>  
>
>
How about the below explanation:

Because bio will be merged at block-layer,so bios-error may caused by other
bio which be merged into to the same request.
For example: if chunk_aligned_read failed, it will add bio to some stipe.
But because the bio-merge function,those bios at most be merged.It will like
the chunk_aligned_read and returen error.Then it will re-write and re-read.
Suppose RAID5 created by n disk and chunk-size is 512K.
If read 512k chunk_aligned and met error(sector 0 is media error), 
then add 512/4=128 stipe.If those bios merged and must be error.The rewrite operation
will read (n-1) * 128 and computer 128 stripe.But using this flag,we only exec one rewrite.
This occur in resync/repair situation and chunk-aligned-read.

May be using this flag,we can find exact bad-sector.Not recorded the whole request bios 
as bad-sector.?韬{.n?????%??檩??w?{.n???{炳盯w???塄}?财??j:+v??????2??璀??摺?囤??z夸z罐?+?????w棹f