Re: Mirroring seek optimization resend

NeilBrown <neilb@xxxxxxx> · Tue, 9 Sep 2014 12:21:13 +1000

On Mon, 8 Sep 2014 10:13:12 -0700 Robert Long <boblong@xxxxxxxxxxxxxx> wrote:

> Hi Again,
> 
> First, the code base that I was working from is now 2 years old. While I doubt that the raid1.c code has changed significantly, I could be very wrong. At that time the load balancing algorithm was simply to attempt to split the actively seeked area of the disk into two parts, and have one disk take the low half and one take the high half. The effect of that, in read-intensive applications, is to cut in half the overall seek length for each disk in the mirror, resulting in considerable seek time savings and overall higher throughput. A moving average of sector addresses needs to be maintained to make this work, but this is quite easily done. The more writing happens, the more this advantage is lost. The problem with this approach, though not a large one, is that a numerical average is not actually representative of the middle of the work area due to outliers, clusters of reads, etc.
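
For readers following the thread, the low-half/high-half split described above can be sketched roughly as follows. This is not taken from raid1.c or from Bob's patch: the struct, the function name, the exponential form of the moving average, and the AVG_SHIFT weight are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-mirror state; none of these names come from raid1.c. */
struct seek_split {
    uint64_t avg_sector;  /* moving average of recent read addresses */
    int      primed;      /* 0 until the first read has been seen */
};

/* Assumed smoothing weight: avg moves 1/2^AVG_SHIFT of the way to each sample. */
#define AVG_SHIFT 3

/*
 * Pick a disk for a read by comparing it with the average of earlier reads
 * (disk 0 takes the low half, disk 1 the high half), then fold the new
 * address into the average.
 */
int choose_disk(struct seek_split *s, uint64_t sector)
{
    int disk;

    assert(s != NULL);
    disk = (s->primed && sector >= s->avg_sector) ? 1 : 0;

    if (!s->primed) {
        s->avg_sector = sector;
        s->primed = 1;
    } else if (sector >= s->avg_sector) {
        s->avg_sector += (sector - s->avg_sector) >> AVG_SHIFT;
    } else {
        s->avg_sector -= (s->avg_sector - sector) >> AVG_SHIFT;
    }
    return disk;
}
```

A single far-away read drags avg_sector toward it, which is the outlier sensitivity the paragraph above mentions.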
> 
> A solution to this is to maintain a running median of the sector address. When this is done you get a measurable increase in throughput, at the cost of greater computational overhead. The method that I used to maintain a moving median involves the creation of an equal-sized min heap and max heap connected together at the head of each. This connection point represents the current median sector of access requests to the mirror.
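
As an illustration of the two-heap idea, here is a minimal sketch of the textbook running median: a max-heap holds the lower half of the samples, a min-heap holds the upper half, and the median sits at the top of the lower heap. It is not Bob's code (none was posted in this message). It keeps two separate heaps rather than a shared head node, never expires old samples, and every name and the HEAP_CAP limit are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define HEAP_CAP 64  /* assumed per-heap capacity, chosen only for this sketch */

/* Binary heap of sector addresses; sign > 0 is a min-heap, sign < 0 a max-heap. */
struct heap {
    uint64_t v[HEAP_CAP];
    size_t   n;
    int      sign;
};

/* True if a must sit above b in this heap. */
static int above(const struct heap *h, uint64_t a, uint64_t b)
{
    return h->sign > 0 ? a < b : a > b;
}

static void heap_push(struct heap *h, uint64_t x)
{
    size_t i = h->n;

    assert(i < HEAP_CAP);
    h->n++;
    while (i > 0 && above(h, x, h->v[(i - 1) / 2])) {  /* sift up */
        h->v[i] = h->v[(i - 1) / 2];
        i = (i - 1) / 2;
    }
    h->v[i] = x;
}

static uint64_t heap_pop(struct heap *h)
{
    uint64_t top, last;
    size_t i = 0;

    assert(h->n > 0);
    top = h->v[0];
    last = h->v[--h->n];
    for (;;) {                                         /* sift down */
        size_t c = 2 * i + 1;
        if (c >= h->n)
            break;
        if (c + 1 < h->n && above(h, h->v[c + 1], h->v[c]))
            c++;
        if (!above(h, h->v[c], last))
            break;
        h->v[i] = h->v[c];
        i = c;
    }
    h->v[i] = last;
    return top;
}

struct running_median {
    struct heap low;   /* max-heap: lower half of the samples */
    struct heap high;  /* min-heap: upper half of the samples */
};

void median_init(struct running_median *m)
{
    m->low.n = 0;
    m->low.sign = -1;
    m->high.n = 0;
    m->high.sign = 1;
}

/* Add a sample, keeping low.n == high.n or high.n + 1. Returns -1 when full. */
int median_add(struct running_median *m, uint64_t sector)
{
    if (m->low.n + m->high.n >= 2 * HEAP_CAP - 1)
        return -1;
    if (m->low.n == 0 || sector <= m->low.v[0])
        heap_push(&m->low, sector);
    else
        heap_push(&m->high, sector);
    if (m->low.n > m->high.n + 1)
        heap_push(&m->high, heap_pop(&m->low));
    else if (m->high.n > m->low.n)
        heap_push(&m->low, heap_pop(&m->high));
    return 0;
}

/* Current (lower) median; at least one sample must have been added. */
uint64_t median_get(const struct running_median *m)
{
    assert(m->low.n > 0);
    return m->low.v[0];
}
```

A read would then go to the low-half disk when its sector is at or below median_get() and to the high-half disk otherwise. Each insertion costs O(log n) heap work, which is the extra computational overhead mentioned above. A true moving median would also have to remove samples as they age out of the window, which this sketch leaves out.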
> 
> Today I fired up the machine that I used to develop and test this modification. It took some time just to locate the modified raid1.c file :) The machine, in addition to the boot drive, is equipped with a couple of 1 TB Hitachi disk drives that were used for test purposes. To be honest I don’t remember the name of the test software that I used, but it seems that it was a pretty standard performance utility, not something I put together myself. Perhaps you could jog my memory. As I mentioned previously, the improvement was around 7% with read-only testing of random seeks.
> 
> If this is of interest I will gladly post the NOT PRODUCTION READY code. (I took some shortcuts in the code for testing purposes, primarily in the assignment of drives to the array so that I didn’t have to figure out how the existing code does such a nice job of drive allocation.)
> 
> On the other hand, if the improvement is deemed to be insufficient then I will chuck the code into the dust bin of history. Given the continued rise in the use of SSDs, seek optimization becomes less relevant every day.
> 
> Please give me some feedback; even if you are not interested, let me know why.
> 

Hi Bob,
 If you post the code I'll try to find time to look at it.  If you don't, I
 won't :-)

 I personally have no pressing desire to improve RAID1 read performance, but
 others have occasionally shown an interest in the past.  Maybe they aren't
 paying attention or have lost interest.  Maybe they don't think their
 interest matters (which wouldn't be true).

 To be honest, "I've got some code that once did something good with some
 tests that I don't remember and probably isn't production ready and may not
 even work" just doesn't sound exciting.
 On the other hand "here is some code" at least provides something concrete
 to look at.

NeilBrown