Dear listmates,
Floating around the Internet are many howtos and references describing how
to back GNBD with DRBD in order to get a failover GNBD service, with GFS
mounted atop the GNBD device. Does anyone know how the following possible
race condition is handled?
1. GFS writes to its GNBD device.
   The GNBD client node sends the write to the GNBD server node.
   The GNBD server writes to the DRBD primary.
   DRBD begins the write to its local disk and to the DRBD secondary.
   Before DRBD completes the write to the secondary (and thus before it
   returns, since writes are synchronous), the DRBD-primary node loses
   power.
   The GNBD server dies with the power loss.
   The GNBD client node drops its connection to the GNBD server.
2. Heartbeat notices the death of the DRBD primary, promotes the DRBD
   secondary to primary, re-exports /dev/drbd0 via GNBD, and re-creates
   the virtual IP the GNBD client was connected to.
3. The GNBD client writing on behalf of GFS reconnects.
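To make the window I'm worried about concrete, here is a toy model of the
race (Python, with all names made up by me; this is not real GNBD or DRBD
code): the write can land on the secondary, or on neither node, without
the client ever seeing an acknowledgement.

    class PowerLoss(Exception):
        """The primary loses power mid-write."""

    class Secondary:
        def __init__(self):
            self.disk = {}

        def replicate(self, sector, data):
            self.disk[sector] = data

    class Primary:
        def __init__(self, secondary, crash_before_ack=False):
            self.disk = {}
            self.secondary = secondary
            self.crash_before_ack = crash_before_ack

        def write(self, sector, data):
            self.disk[sector] = data                 # local write
            self.secondary.replicate(sector, data)   # synchronous replication
            if self.crash_before_ack:
                raise PowerLoss()                    # dies before acking
            return "ack"

    sec = Secondary()
    pri = Primary(sec, crash_before_ack=True)
    try:
        pri.write(0, b"gfs-journal-block")
    except PowerLoss:
        # The data may be on the secondary (as here) or on neither node,
        # but the GNBD client never received an ack for this write.
        print("no ack; secondary has the data:", 0 in sec.disk)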
Now, what happens to the write that was originally headed for the DRBD
volume? Will the GNBD client retry the write? Are there situations where
the write could be dropped altogether?
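What I'm hoping for (but cannot confirm from the GNBD code) is client
behavior roughly like the sketch below: keep each request queued until the
server acknowledges it, and replay unacknowledged requests after
reconnecting. Since rewriting the same data to the same sector is
idempotent, replaying should be safe even if the original write did reach
the disk. Again, the wire format and names here are my invention, purely
for illustration:

    import socket
    import time

    def send_write(sock, sector, data):
        # Hypothetical wire format: 8-byte sector number, payload,
        # then a one-byte ack from the server.
        sock.sendall(sector.to_bytes(8, "big") + data)
        if sock.recv(1) != b"A":
            raise ConnectionError("no ack from GNBD server")

    def reliable_write(server_vip, port, sector, data, retry_delay=5):
        # Hold on to the request until some server behind the virtual
        # IP (the original or the failed-over one) acknowledges it.
        while True:
            try:
                with socket.create_connection((server_vip, port)) as sock:
                    send_write(sock, sector, data)
                    return  # acked: now safe to drop the request
            except OSError:
                # The connection died before the ack arrived, so we do
                # not know whether the write hit the disk. A block
                # write is idempotent, so just resubmit once Heartbeat
                # has moved the virtual IP to the new GNBD server.
                time.sleep(retry_delay)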
Are there other race conditions that could occur here? Other concerns
outside of this scenario?
We are thinking about implementing DRBD+GNBD+GFS+Xen to support failover
and domain migration. In the event of a failure such as power loss, I
would like to be certain that once the GNBD server node we failed over to
is online, any GNBD clients that were halfway through a write will
re-commit that write.
Thoughts?
-Eric