Re: [RFC 1/2]raid1: only write mismatch sectors in sync

On Wed, 31 Oct 2012 11:25:33 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:

> On Thu, Oct 18, 2012 at 01:36:57PM +1100, NeilBrown wrote:
> > On Thu, 18 Oct 2012 10:01:34 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:
> > 
> > > On Thu, Oct 18, 2012 at 12:29:59PM +1100, NeilBrown wrote:
> > > > On Thu, 18 Oct 2012 09:17:35 +0800 Shaohua Li <shli@xxxxxxxxxx> wrote:
> > > >  
> > > > > > > Neil,
> > > > > > > Any further comments on this? This is a usable feature, and I hope we can
> > > > > > > reach some agreement.
> > > > > > 
> > > > > > You still haven't answered my main question, which possibly means I haven't
> > > > > > asked it very clearly.
> > > > > > 
> > > > > > You are saying that this new behaviour should not be the default and I think
> > > > > > I agree.
> > > > > > > So the question is:  how is it selected?
> > > > > > 
> > > > > > You cannot expect the user to explicitly enable it any time a resync or
> > > > > > recovery starts that should use this new feature.  You must have some
> > > > > > automatic, or semi-automatic, way for the feature to be activated, otherwise
> > > > > > it will never be used.
> > > > > > 
> > > > > > > I'm not asking "when should the feature be used" - you've answered that
> > > > > > > question a few times and it really isn't an issue.
> > > > > > > The question is "What is the exact process by which the feature is turned on
> > > > > > > for any particular resync or recovery?"
> > > > > 
> > > > > > So you are worried that users won't know how to correctly select the feature. An
> > > > > > experienced user knows this; the usage scenario I mentioned describes how to make
> > > > > > the decision. For example, a resync after a system crash should enable the
> > > > > > feature. I admit an inexperienced user won't know how to select it, but this
> > > > > > isn't a big problem to me. There are a lot of tunables in the kernel (even in MD)
> > > > > > which can significantly impact kernel behavior. These tunables are just for
> > > > > > experienced users.
> > > > > 
> > > > > Thanks,
> > > > > Shaohua
> > > > 
> > > > 
> > > > You still aren't answering my question.
> > > > 
> > > > What exactly, precisely, specifically, will an "experienced user" do?
> > > 
> > > Write a value to a sysfs entry to enable the feature (my RFC patch adds a new
> > > sysfs entry for the feature), and readd the disk. The resync then does 'only
> > > write mismatch data'. Is this what you asked?
> 
> Sorry for the delay.
>  
> > Yes, that is the sort of thing I was asking for.
> > When you say "readd disk" I assume you mean to use the --readd option to
> > mdadm.
> > That only works when there is a bitmap active on the array, so relatively few
> > blocks will be resynced. Does it really matter which approach is taken:
> > always copy, or read-and-test?
> > 
> > Though maybe you really mean to "--add" the device.  In that case it would
> > probably make sense to add some other option to mdadm to say "enable
> > read-mostly recovery".  I wonder what a good name would be.
> > --minimize-writes ??
> 
> Yep, it's the '--add' case. For the '--readd' with bitmap case, the bitmap can
> already avoid a lot of writes. The use case is something like:
> one disk is broken; trim the whole of a new disk; add the new disk.
> If the source disk has a lot of 0s and we only write mismatched data, we can avoid
> a lot of writes.
> 
> I believe we need such a mechanism for '--create' too, if the first disk has some
> data, but the second disk is empty.
>  
> > You earlier gave a list of scenarios in which you thought this would be
> > useful.  It was:
> > 
> > > > > For 'compare and avoid write if equal' case:
> > > > > 1. update SSD firmware. This doesn't change the data, but we need to take one
> > > > > disk out of the raid at a time.
> > > > > 2. One disk has errors, but these errors don't ruin most of the data (for
> > > > > example, a pcie error)
> > > > > 3. driver/os crash.
> > > > > In all these cases, two raid disks must be resynced, and they have almost identical
> > > > > data. Write avoidance will be very helpful for these.
> > 
> > 
> > For case '3', it would be a "resync" rather than a "recovery".  How would you
> > expect an "advanced user" to choose read-and-test recovery in that case?
> > There is no "readd" command happening.
> 
> If there is a bitmap, maybe we don't need to do read-and-test, so this one isn't
> really necessary at the current stage. If not, what I suggest is:
> 1. user suspends resync (write something to a sysfs file)
> 2. user enables read-and-test (again, write a sysfs file)
> 3. resume resync

So you are happy for the resync to start doing the wrong thing, and expect
the sysadmin to notice, and then take some obscure action to stop it doing
the wrong thing and start it doing the right thing.
Certainly possible, but very error prone I would think.

NeilBrown
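
For reference, the "only write mismatch data" (read-and-test) idea discussed
in this thread can be sketched in user space. This is a minimal illustration
and not the RFC patch's kernel code: the function name sync_mismatch_only,
the 512-byte comparison granularity, and the use of in-memory buffers in
place of block devices are all assumptions made for the example. It also
assumes a freshly trimmed device reads back zeros, which not every device
guarantees.

```python
import io

SECTOR = 512  # assumed comparison granularity in bytes (hypothetical)


def sync_mismatch_only(src, dst, length, chunk=SECTOR):
    """Make dst match src, writing only the chunks that differ.

    src and dst are seekable binary file-like objects standing in for
    the source and target devices. Returns (chunks_compared, chunks_written).
    """
    compared = written = 0
    for offset in range(0, length, chunk):
        size = min(chunk, length - offset)
        src.seek(offset)
        want = src.read(size)
        dst.seek(offset)
        have = dst.read(size)
        compared += 1
        if have != want:
            # Mismatch: this is the only case that costs a write.
            dst.seek(offset)
            dst.write(want)
            written += 1
    return compared, written


# Source disk that is mostly zeros, with one non-zero sector at the end.
source = io.BytesIO(bytes(SECTOR * 7) + b"\x01" * SECTOR)
# New disk, modelled as all zeros after a whole-disk trim.
target = io.BytesIO(bytes(SECTOR * 8))

compared, written = sync_mismatch_only(source, target, SECTOR * 8)
print(f"compared {compared} chunks, wrote {written}")
```

The trade-off is visible in the loop: every chunk costs two reads, and only
mismatching chunks cost a write. That helps when the two sides are nearly
identical and writes are expensive (for example on SSDs), and it is slower
than a plain copy when most chunks differ.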
