Re: raid5: two writing algorithms

On Fri, Feb 08, 2008 at 12:51:39PM +1100, Neil Brown wrote:
> On Friday February 8, keld@xxxxxxxx wrote:
> > On Fri, Feb 08, 2008 at 07:25:31AM +1100, Neil Brown wrote:
> > > On Thursday February 7, keld@xxxxxxxx wrote:
> > 
> > > > So I hereby give the idea for inspiration to kernel hackers.
> > > 
> > > and I hereby invite you to read the code ;-)
> > 
> > I did some reading.  Is there somewhere a description of it, especially
> > the raid code, or are the comments and the code the best documentation?
> 
> No.  If a description was written (and various people have tried to
> describe various parts) it would be out of date within a few months :-(

OK, I was under the impression that some of the code did not change
much. E.g. you said that there had been no work on optimizing raid10
for performance since the 2.6.12 kernel I was using. And at least in
the raid5 code, the last copyright notice right at the top is
Copyright (C) 2002, 2003 H. Peter Anvin. That is 5 years ago,
and your name is not on it. So I did not look much into that code,
thinking nothing had been done there for ages. Maybe you could add your
name to it; that would only be fair. The same goes for the other
modules where it is relevant.

> Look for "READ_MODIFY_WRITE" and "RECONSTRUCT_WRITE" .... no.  That
> only applied to raid6 code now..
> Look instead for the 'rcw' and 'rmw' counters, and then at
> 'handle_write_operations5'  which does different things based on the
> 'rcw' variable.
> 
> It used to be a lot clearer before we implemented xor-offload.  The
> xor-offload stuff is good, but it does make the code more complex.

OK, I think it is fairly well documented here; I can at least follow the
logic. I also think it is a good approach to have the flow
description/strategy included directly in the code. Given that the code
changes often, keeping code and description in separate files could
easily let the two drift badly out of step.
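
To make the 'rmw' / 'rcw' idea above concrete, here is a minimal sketch
of how such counters could drive the choice between the two writing
algorithms. It is a simplified model and not the kernel code: the
struct, the enum and the function name choose_write_method are made up
for illustration. The assumption is that 'rmw' counts the blocks that
must be read for a read-modify-write (old data being replaced, plus old
parity) and 'rcw' counts the blocks that must be read for a
reconstruct-write (data blocks not being overwritten), and that the
method needing fewer reads wins.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one block in a stripe. */
struct blk {
    bool is_parity;  /* this block holds the stripe's parity */
    bool overwrite;  /* the pending write replaces this whole block */
    bool uptodate;   /* a valid copy is already cached in memory */
};

enum write_method { RECONSTRUCT_WRITE, READ_MODIFY_WRITE };

/*
 * Count the reads each method would need and pick the cheaper one.
 * Ties go to reconstruct-write (an assumption of this sketch).
 */
enum write_method choose_write_method(const struct blk *b, size_t n,
                                      int *rmw_out, int *rcw_out)
{
    int rmw = 0, rcw = 0;

    for (size_t i = 0; i < n; i++) {
        /* rmw needs the old contents of written blocks and of parity */
        if ((b[i].overwrite || b[i].is_parity) && !b[i].uptodate)
            rmw++;
        /* rcw needs every data block that is not being overwritten */
        if (!b[i].overwrite && !b[i].is_parity && !b[i].uptodate)
            rcw++;
    }
    if (rmw_out)
        *rmw_out = rmw;
    if (rcw_out)
        *rcw_out = rcw;
    return rmw < rcw ? READ_MODIFY_WRITE : RECONSTRUCT_WRITE;
}
```

In this model, a single-block write to a 5-disk stripe with nothing
cached needs 2 reads for read-modify-write against 3 for
reconstruct-write, so read-modify-write is chosen; a write covering 3 of
the 4 data blocks needs 4 reads against 1, so reconstruct-write is
chosen.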

> 
> 
> > 
> > Do you say that this is already implemented?
> 
> Yes.

That is very good!


Do you know if other implementations of this, e.g. commercial controller
code, have this facility? If not, we could list this as an advantage of
Linux raid. In any case it would show up implicitly in performance
documentation. I do plan to write up something on performance, soonish.
The howto is hopelessly outdated.

IMHO such code should make the performance of raid5 random writes not
that bad, and better than raid5's reputation of being hopelessly slow
for database writing. I think raid5 would be less than twice as slow as
raid1 for random writing.

> > Well, I do have a hack in mind, on the raid10,f2.
> > I need to investigate some more, and possibly test out
> > what really happens. But maybe the code already does what I want it to.
> > You are possibly the one that knows the code best, so maybe you can tell
> > me if raid10,f2 always does its reading in the first part of the disks?
> 
> Yes, I know the code best.
> 
> No, raid10,f2 doesn't always use the first part of the disk.  Getting
> it to do that would be a fairly small change in 'read_balance' in
> md/raid10.c.
> 
> I'm not at all convinced that the read balancing code in raid10 (or
> raid1) really does the best thing.  So any improvements - backed up
> with broad testing - would be most welcome.

I think I know where to make my proposed changes, and how it could be
done. So maybe in the not too distant future I will have done my first
kernel hack!
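
As a rough illustration of the read policy discussed above, the sketch
below picks, among the usable copies of a block, the one closest to the
start of its disk. This is not the 'read_balance' function from
md/raid10.c; the struct copy and the function pick_lowest_copy are
hypothetical names, and the sketch assumes that in the far layout the
first copy of every block lies in the first part of each disk, so that
preferring the lowest device offset keeps reads there.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one readable copy of a block. */
struct copy {
    int disk;            /* index of the member disk holding this copy */
    uint64_t dev_sector; /* sector offset of the copy on that disk */
    int in_sync;         /* non-zero if the disk is usable for reads */
};

/*
 * Return the index of the usable copy nearest the start of its disk,
 * or -1 if no copy is usable.
 */
int pick_lowest_copy(const struct copy *c, size_t n)
{
    int best = -1;

    for (size_t i = 0; i < n; i++) {
        if (!c[i].in_sync)
            continue; /* skip failed or rebuilding disks */
        if (best < 0 || c[i].dev_sector < c[best].dev_sector)
            best = (int)i;
    }
    return best;
}
```

A real change would also have to weigh this against head position and
load on each disk, which is why broad testing would be needed before
drawing conclusions.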

Best regards
keld