On Sun, Aug 03, 2008 at 02:54:13PM +0200, Keld Jørn Simonsen wrote:
> On Sun, Aug 03, 2008 at 07:32:00AM -0500, Jon Nelson wrote:
> > After digging through the code (admittedly, way too late at night), I
> > think I have a basic understanding of how the resync code works, and
> > why it appears to be suboptimal (speed-wise) for raid10.
> >
> > It would appear that, upon receipt of a 'check' (other resync methods
> > have different paths, sometimes), md.c basically says, "start at the
> > first sector or the first sector after the checkpoint and proceed
> > logically through the end (unless told to stop)" and md.c schedules
> > this check with the relevant sync_request method. For raid10, this
> > finds the first device with that logical sector as a copy and then
> > compares the data there to the data in all of the other copies on the
> > other disks. For raid10 in f2 format (and to a lesser extent with the
> > offset format) this is going to result in a great deal of thrashing.
> > I'm guessing this is the reason why a 'check' operation on raid10,f2
> > takes 2x as long as for raid5 (same disks). One way to improve the
> > efficiency here would be to perform a loop like this:
> >
> > for device in devices:
> >     for chunk that is not a mirror:
> >         read chunk
> >         compare chunk to mirror chunks on other devices
> >
> > If I'm not wrong this should result in near streaming speeds from each
> > device with a minimum of seeking. However, to effect this change it
> > looks like the changes would be more invasive than just changing
> > raid10.c. One way, of course, might be to abstract the sync code just
> > a bit more so that md.c could ask each device to provide a function
> > which does the driving (the above 4 lines) and md.c does all of the
> > common error checking, interrupt checking, etc... Does this seem like
> > crazy talk? If I can get some help I might give it a stab.
>
> My idea is to do the checks in bigger blocks; then you would minimize
> the thrashing by minimizing the number of times you need to move the
> head. And this would not need much change in the code. I have done a
> patch to do this, but I have not yet tested it.

Maybe you could test the patch? It is enclosed.

Best regards
keld
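
To illustrate why bigger blocks should help, here is a rough toy model in C. It is only a sketch under my own assumptions and not the md code: I assume a raid10 f2 layout where logical chunk c has its first copy on device c % ndev in the near half of the disk and its second copy on device (c + 1) % ndev in the far half, and I assume a check reads one window of chunks from the near half and then the matching copies from the far half. The names count_zone_switches, touch and MAX_DEV are made up for this example. The function counts how often a head has to move between the two halves.

```c
#include <assert.h>
#include <stdio.h>

/* Which half of the disk a head was last positioned in (toy model). */
enum zone { ZONE_NONE, ZONE_NEAR, ZONE_FAR };

#define MAX_DEV 16	/* hypothetical limit, only for this sketch */

/* Record an access to device dev in zone z; count a switch if the head
 * was previously in the other zone. */
static void touch(enum zone *head, int dev, enum zone z, long *switches)
{
	if (head[dev] != ZONE_NONE && head[dev] != z)
		(*switches)++;
	head[dev] = z;
}

/*
 * Count near/far zone switches when checking nchunks logical chunks in
 * logical order, window chunks at a time, on ndev devices.
 * Assumed layout: copy 1 of chunk c on device c % ndev (near zone),
 * copy 2 on device (c + 1) % ndev (far zone).
 */
long count_zone_switches(int ndev, long nchunks, long window)
{
	enum zone head[MAX_DEV] = { ZONE_NONE };
	long switches = 0;

	assert(ndev >= 2 && ndev <= MAX_DEV && window >= 1);

	for (long start = 0; start < nchunks; start += window) {
		long end = start + window < nchunks ? start + window : nchunks;

		/* Read the whole window from the near zone first ... */
		for (long c = start; c < end; c++)
			touch(head, (int)(c % ndev), ZONE_NEAR, &switches);
		/* ... then the matching copies from the far zone. */
		for (long c = start; c < end; c++)
			touch(head, (int)((c + 1) % ndev), ZONE_FAR, &switches);
	}
	return switches;
}
```

Calling count_zone_switches(2, 1024, 1) and count_zone_switches(2, 1024, 16) compares a one-chunk window with a 16-chunk window on two disks. In this model the larger window gives far fewer zone switches, which is the effect I am hoping for. The real request ordering in md and in the I/O scheduler is more complicated, so take this only as an illustration.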
--- raid10.c	2008-07-12 18:28:59.438235317 +0200
+++ raid10.c~	2008-07-03 05:46:47.000000000 +0200
@@ -80,7 +80,7 @@
 //#define RESYNC_BLOCK_SIZE PAGE_SIZE
 #define RESYNC_SECTORS (RESYNC_BLOCK_SIZE >> 9)
 #define RESYNC_PAGES ((RESYNC_BLOCK_SIZE + PAGE_SIZE-1) / PAGE_SIZE)
-#define RESYNC_WINDOW (2048*1024*16)
+#define RESYNC_WINDOW (2048*1024)
 
 /*
  * When performing a resync, we need to read and compare, so
@@ -686,7 +686,7 @@
  * there is no normal IO happeing. It must arrange to call
  * lower_barrier when the particular background IO completes.
  */
-#define RESYNC_DEPTH 32*16
+#define RESYNC_DEPTH 32
 
 static void raise_barrier(conf_t *conf, int force)
 {
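
Note that the diff above appears to run from the modified raid10.c to the backup raid10.c~, so the lines marked with - carry the enlarged values. As a small sketch of what the two sets of constants amount to, the C below works out the window size in MiB and the number of resync blocks per window. The 64 KiB block size is an assumption on my part, since RESYNC_BLOCK_SIZE is not shown in the diff, and the names ASSUMED_RESYNC_BLOCK_SIZE, window_mib and blocks_per_window are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/* The two values of each constant that appear in the diff. */
#define RESYNC_WINDOW_SMALL (2048 * 1024)
#define RESYNC_WINDOW_LARGE (2048 * 1024 * 16)
#define RESYNC_DEPTH_SMALL  32
#define RESYNC_DEPTH_LARGE  (32 * 16)

/* Assumption: 64 KiB resync block; this value is not in the diff. */
#define ASSUMED_RESYNC_BLOCK_SIZE (64 * 1024)

/* Window size in MiB. */
long window_mib(long window_bytes)
{
	return window_bytes / (1024 * 1024);
}

/* Number of (assumed 64 KiB) resync blocks that fit in one window. */
long blocks_per_window(long window_bytes)
{
	assert(window_bytes % ASSUMED_RESYNC_BLOCK_SIZE == 0);
	return window_bytes / ASSUMED_RESYNC_BLOCK_SIZE;
}

/* Print both variants side by side. */
void print_resync_summary(void)
{
	printf("small: %ld MiB window, %ld blocks, depth %d\n",
	       window_mib(RESYNC_WINDOW_SMALL),
	       blocks_per_window(RESYNC_WINDOW_SMALL), RESYNC_DEPTH_SMALL);
	printf("large: %ld MiB window, %ld blocks, depth %d\n",
	       window_mib(RESYNC_WINDOW_LARGE),
	       blocks_per_window(RESYNC_WINDOW_LARGE), RESYNC_DEPTH_LARGE);
}
```

Under the 64 KiB assumption the number of blocks per window equals the depth in both variants, which is why I scaled both constants by the same factor of 16.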