>On Thu, 31 Oct 2013 11:20:55 +0800 majianpeng <majianpeng@xxxxxxxxx> wrote: > >Sorry for the delayed reply. > > >> >On Tue, 29 Oct 2013 09:30:14 +0800 majianpeng <majianpeng@xxxxxxxxx> wrote: >> > >> > >> >Nearly there!! Just a few more details. See below. >> > >> > >> >> Signed-off-by: Jianpeng Ma <majianpeng@xxxxxxxxx> >> >> --- >> >> drivers/md/raid1.c | 122 +++++++++++++++++++++++++++++++++++++++++++++++------ >> >> drivers/md/raid1.h | 14 ++++++ >> >> 2 files changed, 124 insertions(+), 12 deletions(-) >> >> >> >> diff --git a/drivers/md/raid1.c b/drivers/md/raid1.c >> >> index b4a6dcd..5b311c0 100644 >> >> --- a/drivers/md/raid1.c >> >> +++ b/drivers/md/raid1.c >> >> @@ -66,7 +66,8 @@ >> >> */ >> >> static int max_queued_requests = 1024; >> >> >> >> -static void allow_barrier(struct r1conf *conf); >> >> +static void allow_barrier(struct r1conf *conf, sector_t start_next_window, >> >> + sector_t bi_sector); >> >> static void lower_barrier(struct r1conf *conf); >> >> >> >> static void * r1bio_pool_alloc(gfp_t gfp_flags, void *data) >> >> @@ -227,6 +228,8 @@ static void call_bio_endio(struct r1bio *r1_bio) >> >> struct bio *bio = r1_bio->master_bio; >> >> int done; >> >> struct r1conf *conf = r1_bio->mddev->private; >> >> + sector_t start_next_window = r1_bio->start_next_window; >> >> + sector_t bi_sector = bio->bi_sector; >> > >> >This should be r1_bio->sector, not bio->bi_sector. >> >They are often the same but if multiple r1_bios are needed for some reason >> >(e.g. bad blocks) they may not be. >> > >> No, allow_barrier() only for bio not for r1bio.It only do when bio->bi_phys_segments == 0. > >Yes, I see that now, thanks. > > >> If this value is r1_bio->sector, the value may has different value as you said. >> So in allow_barrier(): >> if (start_next_window) { >> if (start_next_window == conf->start_next_window) { >> if (conf->start_next_window + NEXT_NORMALIO_DISTANCE >> <= bi_sector) >> conf->next_window_requests--; >> else >> conf->current_window_requests--; >> } else >> It will has differnt result. >> >> >> >> >> if (bio->bi_phys_segments) { >> >> unsigned long flags; >> >> @@ -234,6 +237,11 @@ static void call_bio_endio(struct r1bio *r1_bio) >> >> bio->bi_phys_segments--; >> >> done = (bio->bi_phys_segments == 0); >> >> spin_unlock_irqrestore(&conf->device_lock, flags); >> >> + /* >> >> + * make_request() might be waiting for >> >> + * bi_phys_segments to decrease >> >> + */ >> >> + wake_up(&conf->wait_barrier); >> >> } else >> >> done = 1; >> >> >> >> @@ -245,7 +253,7 @@ static void call_bio_endio(struct r1bio *r1_bio) >> >> * Wake up any possible resync thread that waits for the device >> >> * to go idle. >> >> */ >> >> - allow_barrier(conf); >> >> + allow_barrier(conf, start_next_window, bi_sector); >> >> } >> >> } >> >> >> >> @@ -827,10 +835,18 @@ static void raise_barrier(struct r1conf *conf) >> >> /* block any new IO from starting */ >> >> conf->barrier++; >> >> >> >> - /* Now wait for all pending IO to complete */ >> >> + /* For those conditions we must wait: >> >> + * A:while the array is in frozen state >> >> + * B:while barrier >= RESYNC_DEPTH, meaning resync reach >> >> + * the max count which allowed. >> >> + * C:next_resync + RESYNC_SECTORS > start_next_window, meaning >> >> + * next resync will reach to window which normal bios are handling. >> >> + */ >> >> wait_event_lock_irq(conf->wait_barrier, >> >> !conf->array_frozen && >> >> - !conf->nr_pending && conf->barrier < RESYNC_DEPTH, >> >> + conf->barrier < RESYNC_DEPTH && >> >> + (conf->start_next_window >= >> >> + conf->next_resync + RESYNC_SECTORS), >> >> conf->resync_lock); >> > >> >You've removed the test on conf->nr_pending here, which I think is correct. >> >It counts 'read' requests as well. Testing start_next_window serves the same >> >purpose as it is increased whenever current_window_requests reaches zero. >> > >> In func wait_barrier(): >> if (need_to_wait_for_sync(conf, bio)) { >> [snip] >> } else if (bio_data_dir(bio) == WRITE) { >> if (conf->next_resync + NEXT_NORMALIO_DISTANCE >> <= bio->bi_sector) { >> if (conf->start_next_window == MaxSector) >> conf->start_next_window = >> conf->next_resync + >> NEXT_NORMALIO_DISTANCE; >> >> if ((conf->start_next_window + NEXT_NORMALIO_DISTANCE) >> <= bio->bi_sector) >> conf->next_window_requests++; >> else >> conf->current_window_requests++; >> } >> if (bio->bi_sector >= conf->start_next_window) >> sector = conf->start_next_window; >> } >> start_next_window only for bio_data_dir(bio) == WRITE. So resync can't wait read io to complete. >> >> >However you having modified as similar test on nr_pending in wait_barrier(). >I meant: haven't modified a similar test .... >> >That worries me a bit. Should that be changed to a test on start_next_window >> >to match the above change? >> For this condition, i only copy it.For this condition, i'm not know clearly. >> Can you give me more details about this? > > >Before your patch we know if there are pending requests (submitted but not >completed) which stop use from doing resync if conf->nr_pending > 0. >After your patch we cannot use that that test as it counts READ requests, and >requests that are outside the range being resynced. > >The test is now a bit more complicated. I think it is: > > conf->start_next_window < > conf->next_resync + RESYNC_SECTORS > > >i.e. while start_next_window is less than next_resync+RESYNC_SECTORS there >cold be outstanding writes that are blocking resync. As soon as those writes >complete, start_next_window will increase and the resync can complete. >So if start_next_window < next_resync+RESYNC_SECTORS and there are pending >request in current->bio_list, then then there is no point waiting for resync >(it cannot progress anyway) so we may as well submit another write. > >So change: > > wait_event_lock_irq(conf->wait_barrier, > !conf->array_frozen && > (!conf->barrier || > (conf->nr_pending && > current->bio_list && > !bio_list_empty(current->bio_list))), > conf->resync_lock); > >in wait barrier to > > wait_event_lock_irq(conf->wait_barrier, > !conf->array_frozen && > (!conf->barrier || > (conf->start_next_window < conf->next_resync + RESYNC_SECTORS && > current->bio_list && > !bio_list_empty(current->bio_list))), > conf->resync_lock); > > Ok, apply this. >> > >> > >> >> >> >> spin_unlock_irq(&conf->resync_lock); >> >> @@ -846,10 +862,33 @@ static void lower_barrier(struct r1conf *conf) >> >> wake_up(&conf->wait_barrier); >> >> } >> >> >> >> -static void wait_barrier(struct r1conf *conf) >> >> +static bool need_to_wait_for_sync(struct r1conf *conf, struct bio *bio) >> >> +{ >> >> + bool wait = false; >> >> + >> >> + if (conf->array_frozen || !bio) >> >> + wait = true; >> >> + else if (conf->barrier && bio_data_dir(bio) == WRITE) { >> >> + if (conf->next_resync < RESYNC_WINDOW_SECTORS) >> >> + wait = true; >> >> + else if ((conf->next_resync - RESYNC_WINDOW_SECTORS >> >> + >= bio_end_sector(bio)) || >> >> + (conf->next_resync + NEXT_NORMALIO_DISTANCE >> >> + <= bio->bi_sector)) >> >> + wait = false; >> >> + else >> >> + wait = true; >> >> + } >> >> + >> >> + return wait; >> >> +} >> >> + >> >> +static sector_t wait_barrier(struct r1conf *conf, struct bio *bio) >> >> { >> >> + sector_t sector = 0; >> >> + >> >> spin_lock_irq(&conf->resync_lock); >> >> - if (conf->barrier) { >> >> + if (need_to_wait_for_sync(conf, bio)) { >> >> conf->nr_waiting++; >> >> /* Wait for the barrier to drop. >> >> * However if there are already pending >> >> @@ -868,16 +907,57 @@ static void wait_barrier(struct r1conf *conf) >> >> !bio_list_empty(current->bio_list))), >> >> conf->resync_lock); >> >> conf->nr_waiting--; >> >> + } else if (bio_data_dir(bio) == WRITE) { >> >> + if (conf->next_resync + NEXT_NORMALIO_DISTANCE >> >> + <= bio->bi_sector) { >> >> + if (conf->start_next_window == MaxSector) >> >> + conf->start_next_window = >> >> + conf->next_resync + >> >> + NEXT_NORMALIO_DISTANCE; >> >> + >> >> + if ((conf->start_next_window + NEXT_NORMALIO_DISTANCE) >> >> + <= bio->bi_sector) >> >> + conf->next_window_requests++; >> >> + else >> >> + conf->current_window_requests++; >> >> + } >> >> + if (bio->bi_sector >= conf->start_next_window) >> >> + sector = conf->start_next_window; >> > >> >You aren't setting 'sector' if we needed to wait. I don't think that is >> >correct, is it? >> > >> Yes.How about those code: >> spin_lock_irq(&conf->resync_lock); >> retry_check: >> if (need_to_wait_for_sync(conf, bio)) { >> conf->nr_waiting++; >> /* Wait for the barrier to drop. >> * However if there are already pending >> * requests (preventing the barrier from >> * rising completely), and the >> * pre-process bio queue isn't empty, >> * then don't wait, as we need to empty >> * that queue to get the nr_pending >> * count down. >> */ >> wait_event_lock_irq(conf->wait_barrier, >> !conf->array_frozen && >> (!conf->barrier || >> (conf->nr_pending && >> current->bio_list && >> !bio_list_empty(current->bio_list))), >> conf->resync_lock); >> conf->nr_waiting--; >> goto retry_check; >> > > >It would look a lot better if you made it a while loop: > > while (need_to_wait....) { > nr_waiting++ > wait_event..... > nr_waiting-- > if (bio_data_dir..... > For this, i think it don;t need while() or recheck. The code should: if (nedd_to_wait...) { nr_waiting++ wait_event..... nr_waiting-- } if (bio && bio_data_dir(bio) == WRITE) { do; } I think again need_to_wait...(),it must return false. So we don't recheck this. Or am I missing something? Thanks! Jianpeng Ma ?韬{.n?????%??檩??w?{.n???{炳盯w???塄}?财??j:+v??????2??璀??摺?囤??z夸z罐?+?????w棹f