Luca Berra <bluca@xxxxxxxxxx> wrote:
If we want to do data-replication, access to the data-replicated device should be controlled by the data replication process (*), md does not guarantee this.
Well, if one writes to the md device, then md does guarantee this - but I find it hard to parse the statement. Can you elaborate a little in order to reduce my possible confusion?
I'll try
In a fault-tolerant architecture we have two systems, each with
local storage which is exposed to the other system via nbd or similar.
One node is active and writes data to an md device composed of the
local storage and the nbd device.
The other node is on stand-by, ready to take the place of the former in
case it fails.
I assume for the moment that data replication is synchronous (the write system
call returns when I/O has been submitted to both underlying devices) (*).
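To make the synchronous assumption concrete, here is a minimal sketch in Python. It is only an illustration, not how md is implemented: `MemoryDisk` and `mirrored_write` are hypothetical names, and the two "devices" are plain in-memory buffers standing in for the local storage and the nbd device.

```python
class MemoryDisk:
    """Hypothetical stand-in for a block device (local disk or nbd export)."""

    def __init__(self, size):
        self.data = bytearray(size)
        self.failed = False

    def write(self, offset, payload):
        if self.failed:
            raise OSError("device unavailable")
        if offset < 0 or offset + len(payload) > len(self.data):
            raise ValueError("write outside device")
        self.data[offset:offset + len(payload)] = payload


def mirrored_write(local, remote, offset, payload):
    """Synchronous replication: return only after both legs took the write.

    If the remote leg fails, the error reaches the caller, but the local
    leg has already been written. Deciding what to do next is exactly the
    policy question (continue without a replica, or fail the write).
    """
    local.write(offset, payload)
    remote.write(offset, payload)
    return len(payload)
```

With both legs healthy, `mirrored_write(local, remote, 4, b"abc")` leaves the two buffers identical; once `remote.failed` is set, the same call raises `OSError` while the local copy has already changed.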
We can have a series of failures which must be accounted for and dealt with according to a policy that might be site-specific:
A) Failure of the standby node
   A.1) the active node is allowed to continue in the absence of a data replica
   A.2) disk writes from the active node should return an error
   We can configure this setting in advance.
B) Failure of the active node
   B.1) the standby node immediately takes ownership of the data and resumes processing
   B.2) the standby node remains idle
C) communication failure between the two nodes (and we don't have an external mechanism to arbitrate the split-brain condition)
   C.1) both systems panic and halt
   C.2) A.1 + B.2
   C.3) A.2 + B.2
   C.4) A.1 + B.1
   C.5) A.2 + B.1 (which hopefully will go to A.2 itself)
D) communication failure between the two nodes (admitting we have an external mechanism to arbitrate the split-brain condition)
   D.1) A.1 + B.2
   D.2) A.2 + B.2
   D.3) B.1 then A.1
   D.4) B.1 then A.2
E) rolling failure (C, then B)
F) rolling failure (D, then B)
G) a failed node is restored
H) a node (re)starts while the other is failed
I) a node (re)starts during C
J) a node (re)starts during D
K) a node (re)starts during E
L) a node (re)starts during F
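As a rough illustration of how the choices under A and B combine in scenario C, the site policy can be written down as data and evaluated. This is only a sketch under my own naming: `Policy`, `OnStandbyLoss`, `OnActiveLoss`, `react_to_link_loss` and `risks_split_brain` are hypothetical, and nothing here corresponds to an existing md or cluster-manager interface.

```python
from dataclasses import dataclass
from enum import Enum


class OnStandbyLoss(Enum):      # scenario A
    CONTINUE = "A.1"            # active keeps running without a replica
    FAIL_WRITES = "A.2"         # writes on the active return an error


class OnActiveLoss(Enum):       # scenario B
    TAKE_OVER = "B.1"           # standby takes ownership of the data
    STAY_IDLE = "B.2"           # standby does nothing


@dataclass(frozen=True)
class Policy:
    on_standby_loss: OnStandbyLoss
    on_active_loss: OnActiveLoss
    halt_on_link_loss: bool = False   # C.1


def react_to_link_loss(policy):
    """Scenario C: no arbiter, so each node just sees its peer as gone.

    Returns (what the active node does, what the standby node does).
    """
    if policy.halt_on_link_loss:
        return ("halt", "halt")
    if policy.on_standby_loss is OnStandbyLoss.CONTINUE:
        active = "continue unreplicated"
    else:
        active = "fail writes"
    if policy.on_active_loss is OnActiveLoss.TAKE_OVER:
        standby = "take over"
    else:
        standby = "stay idle"
    return (active, standby)


def risks_split_brain(policy):
    """True for C.4 (A.1 + B.1): both nodes end up writing on their own."""
    return react_to_link_loss(policy) == ("continue unreplicated", "take over")
```

For example, `Policy(OnStandbyLoss.CONTINUE, OnActiveLoss.TAKE_OVER)` is C.4 and is flagged by `risks_split_brain`, while `Policy(OnStandbyLoss.FAIL_WRITES, OnActiveLoss.STAY_IDLE)` is C.3 and is not.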
scenarios without sub-scenarios are left as an exercise to the reader, or I might find myself losing a job :)
now evaluate all scenarios under the following drivers:
1) data availability above all others
2) replica of data above all others
3) data availability above replica, but data consistency above availability
(*) if you got this far, add asynchronous replicas to the picture.
Regards, Luca
--
Luca Berra -- bluca@xxxxxxxxxx
Communication Media & Services S.r.l.