On Tue, 17 Apr 2012 13:09:38 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote: > > REQ_SYNC is ignored in current raid5 code. Block layer does use it to do > policy, > for example ioscheduler. This patch adds it. > > Signed-off-by: Shaohua Li <shli@xxxxxxxxxxxx> > --- > drivers/md/raid5.c | 12 ++++++++++-- > drivers/md/raid5.h | 1 + > 2 files changed, 11 insertions(+), 2 deletions(-) > > Index: linux/drivers/md/raid5.c > =================================================================== > --- linux.orig/drivers/md/raid5.c 2012-04-17 11:59:24.556008869 +0800 > +++ linux/drivers/md/raid5.c 2012-04-17 12:06:06.306007114 +0800 > @@ -518,6 +518,8 @@ static void ops_run_io(struct stripe_hea > replace_only = 1; > } else > continue; > + if (test_and_clear_bit(R5_SyncIO, &sh->dev[i].flags)) > + rw |= REQ_SYNC; > > bi = &sh->dev[i].req; > rbi = &sh->dev[i].rreq; /* For writing to replacement */ > @@ -1114,6 +1116,8 @@ ops_run_biodrain(struct stripe_head *sh, > dev->sector + STRIPE_SECTORS) { > if (wbi->bi_rw & REQ_FUA) > set_bit(R5_WantFUA, &dev->flags); > + if (wbi->bi_rw & REQ_SYNC) > + set_bit(R5_SyncIO, &dev->flags); > tx = async_copy_data(1, wbi, dev->page, > dev->sector, tx); > wbi = r5_next_bio(wbi, dev->sector); > @@ -1131,13 +1135,15 @@ static void ops_complete_reconstruct(voi > int pd_idx = sh->pd_idx; > int qd_idx = sh->qd_idx; > int i; > - bool fua = false; > + bool fua = false, sync = false; > > pr_debug("%s: stripe %llu\n", __func__, > (unsigned long long)sh->sector); > > - for (i = disks; i--; ) > + for (i = disks; i--; ) { > fua |= test_bit(R5_WantFUA, &sh->dev[i].flags); > + sync |= test_bit(R5_SyncIO, &sh->dev[i].flags); > + } > > for (i = disks; i--; ) { > struct r5dev *dev = &sh->dev[i]; > @@ -1146,6 +1152,8 @@ static void ops_complete_reconstruct(voi > set_bit(R5_UPTODATE, &dev->flags); > if (fua) > set_bit(R5_WantFUA, &dev->flags); > + if (sync) > + set_bit(R5_SyncIO, &dev->flags); > } > } > > Index: linux/drivers/md/raid5.h > =================================================================== > --- linux.orig/drivers/md/raid5.h 2012-04-17 11:59:26.866008859 +0800 > +++ linux/drivers/md/raid5.h 2012-04-17 12:00:29.196008586 +0800 > @@ -285,6 +285,7 @@ enum r5dev_flags { > */ > R5_Wantdrain, /* dev->towrite needs to be drained */ > R5_WantFUA, /* Write should be FUA */ > + R5_SyncIO, /* The IO is sync */ > R5_WriteError, /* got a write error - need to record it */ > R5_MadeGood, /* A bad block has been fixed by writing to it */ > R5_ReadRepl, /* Will/did read from replacement rather than orig */ Thanks. There was some interesting white-space damage in this patch, but I sorted it out and applied it. However I'm not convinced that it is correct. It is probably better than nothing so I'll allow it for now, but you might want to fix it up to make it better. The point of REQ_SYNC is to remove any delays in handling the write, and get it to disk as soon as possible. However with RAID5, the write isn't complete until all the blocks in the stripe have been written, so just setting REQ_SYNC on the write for one of the blocks may not achieve the goal. You at least need to use REQ_SYNC on the parity block, and you probably want it on all the blocks in the stripe. So rather than setting R5_SyncIO on the dev, you should be setting in on the whole stripe_head. Similarly WantFUA should affect all device writes in the strip as they all need to hit the media for any of the data to be considered 'safe'. If you would like to fix both of those issues at once, that would be great :-) Thanks, NeilBrown
Attachment:
signature.asc
Description: PGP signature