On Tue, 3 Jul 2012 16:52:49 +0800 majianpeng <majianpeng@xxxxxxxxx> wrote: > On 2012-07-03 14:38 NeilBrown <neilb@xxxxxxx> Wrote: > >On Tue, 3 Jul 2012 14:23:41 +0800 majianpeng <majianpeng@xxxxxxxxx> wrote: > > > >> Because bios will merge at block-layer,so bios-error may caused by other > >> bio which be merged into to the same request. > >> Using this flag,it will find exactly error-sector and not do redundant > >> operation like re-write and re-read. > >> > >> Signed-off-by: majianpeng <majianpeng@xxxxxxxxx> > > > >Hi, > > I think this patch needs a more detailed explanation. > > > >What exactly is the situation that causes a problem, and what exactly is the > >problem that it causes? > > > >Pretend that I don't know anything about what happens below the md level.. > > > >Thanks, > >NeilBrown > > > > > >> --- > >> block/blk-core.c | 8 ++++++++ > >> drivers/md/raid5.c | 16 +++++++++++++--- > >> drivers/md/raid5.h | 1 + > >> 3 files changed, 22 insertions(+), 3 deletions(-) > >> > >> diff --git a/block/blk-core.c b/block/blk-core.c > >> index 3c923a7..ee04bfc 100644 > >> --- a/block/blk-core.c > >> +++ b/block/blk-core.c > >> @@ -1401,6 +1401,9 @@ void init_request_from_bio(struct request *req, struct bio *bio) > >> if (bio->bi_rw & REQ_RAHEAD) > >> req->cmd_flags |= REQ_FAILFAST_MASK; > >> > >> + if (unlikely(bio->bi_rw & REQ_NOMERGE)) > >> + req->cmd_flags |= REQ_NOMERGE; > >> + > >> req->errors = 0; > >> req->__sector = bio->bi_sector; > >> req->ioprio = bio_prio(bio); > >> @@ -1428,6 +1431,11 @@ void blk_queue_bio(struct request_queue *q, struct bio *bio) > >> goto get_rq; > >> } > >> > >> + if (unlikely(bio->bi_rw & REQ_NOMERGE)) { > >> + spin_lock_irq(q->queue_lock); > >> + where = ELEVATOR_INSERT_BACK; > >> + goto get_rq; > >> + } > >> /* > >> * Check if we can merge with the plugged list before grabbing > >> * any locks. > >> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c > >> index d267672..04f78d2 100644 > >> --- a/drivers/md/raid5.c > >> +++ b/drivers/md/raid5.c > >> @@ -632,6 +632,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s) > >> else > >> bi->bi_sector = (sh->sector > >> + rdev->data_offset); > >> + if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags)) > >> + bi->bi_rw |= REQ_NOMERGE; > >> + > >> bi->bi_flags = 1 << BIO_UPTODATE; > >> bi->bi_idx = 0; > >> bi->bi_io_vec[0].bv_len = STRIPE_SIZE; > >> @@ -1731,7 +1734,9 @@ static void raid5_end_read_request(struct bio * bi, int error) > >> atomic_add(STRIPE_SECTORS, &rdev->corrected_errors); > >> clear_bit(R5_ReadError, &sh->dev[i].flags); > >> clear_bit(R5_ReWrite, &sh->dev[i].flags); > >> - } > >> + } else if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags)) > >> + clear_bit(R5_ReadNoMerge, &sh->dev[i].flags); > >> + > >> if (atomic_read(&rdev->read_errors)) > >> atomic_set(&rdev->read_errors, 0); > >> } else { > >> @@ -1773,7 +1778,11 @@ static void raid5_end_read_request(struct bio * bi, int error) > >> else > >> retry = 1; > >> if (retry) > >> - set_bit(R5_ReadError, &sh->dev[i].flags); > >> + if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags)) { > >> + set_bit(R5_ReadError, &sh->dev[i].flags); > >> + clear_bit(R5_ReadNoMerge, &sh->dev[i].flags); > >> + } else > >> + set_bit(R5_ReadNoMerge, &sh->dev[i].flags); > >> else { > >> clear_bit(R5_ReadError, &sh->dev[i].flags); > >> clear_bit(R5_ReWrite, &sh->dev[i].flags); > >> @@ -4481,7 +4490,8 @@ static int retry_aligned_read(struct r5conf *conf, struct bio *raid_bio) > >> conf->retry_read_aligned = raid_bio; > >> return handled; > >> } > >> - > >> + if (likely(raid_bio->bi_size >> 9) > STRIPE_SECTORS) > >> + set_bit(R5_ReadNoMerge, &sh->dev[dd_idx].flags); > >> handle_stripe(sh); > >> release_stripe(sh); > >> handled++; > >> diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h > >> index 2164021..6767d07 100644 > >> --- a/drivers/md/raid5.h > >> +++ b/drivers/md/raid5.h > >> @@ -273,6 +273,7 @@ enum r5dev_flags { > >> R5_Wantwrite, > >> R5_Overlap, /* There is a pending overlapping request > >> * on this block */ > >> + R5_ReadNoMerge, /* prevent bio from merging in block-layer */ > >> R5_ReadError, /* seen a read error here recently */ > >> R5_ReWrite, /* have tried to over-write the readerror */ > >> > > > > > How about the below explanation: > > Because bio will be merged at block-layer,so bios-error may caused by other > bio which be merged into to the same request. > For example: if chunk_aligned_read failed, it will add bio to some stipe. > But because the bio-merge function,those bios at most be merged.It will like > the chunk_aligned_read and returen error.Then it will re-write and re-read. > Suppose RAID5 created by n disk and chunk-size is 512K. > If read 512k chunk_aligned and met error(sector 0 is media error), > then add 512/4=128 stipe.If those bios merged and must be error.The rewrite operation > will read (n-1) * 128 and computer 128 stripe.But using this flag,we only exec one rewrite. > This occur in resync/repair situation and chunk-aligned-read. > > May be using this flag,we can find exact bad-sector.Not recorded the whole request bios > as bad-sector Thanks. That make some sense. I wonder if we can just use REQ_FUA to stop requests being merged, rather than create a new flag. In general we should probably be using REQ_FUA in all the cases where we are checking for, and trying to fix, read errors. Thanks, NeilBrown
Attachment:
signature.asc
Description: PGP signature