Since I missed some important setup details, I am sending an addendum to
explain in more detail how to set up cluster-md.
Requirements
============
You will need a multi-node cluster set up using corosync and
pacemaker. You can read more about how to set up a cluster in one of
the available guides [3]. Make sure that you are using a corosync
version greater than 2.3.1.
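For example, you can check the installed version with:
# corosync -v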
You need to have the Distributed Lock Manager (DLM) service running on
all the nodes of the cluster. A simple CRM configuration on my virtual
cluster is:
node 1084752262: node3
node 1084752351: node1
node 1084752358: node2
primitive dlm ocf:pacemaker:controld \
        op start interval=0 timeout=90 \
        op stop interval=0 timeout=100 \
        op monitor interval=60 timeout=60
primitive stone stonith:external/libvirt \
        params hostlist="node1,node2,node3" \
        hypervisor_uri="qemu+tcp://vmhnost/system" \
        op start timeout=120s interval=0 \
        op stop timeout=120s interval=0 \
        op monitor interval=40s timeout=120s \
        meta target-role=Started
group base-group dlm
clone base-clone base-group \
        meta interleave=true target-role=Started
Note: The configuration may be different for your scenario.
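If you use the crm shell, a configuration like the above can be loaded
from a file and then verified, for example (the file name is just a
placeholder):
# crm configure load update cluster-md.crm
# crm status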
This work requires some patches to the mdadm tool [1]. The changes to
mdadm are just enough to get clustered md up and running; a couple of
option checks are still missing, so use with care. You will need the
corosync libraries to compile the cluster-related parts of mdadm.
Download and install the patched mdadm.
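One possible way to fetch and build the patched mdadm (assuming the
corosync development headers are already installed) is:
# git clone https://github.com/goldwynr/mdadm.git
# cd mdadm
# git checkout cluster-md
# make
# make install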
How to use cluster-md:
1. On your corosync/pacemaker-based cluster with DLM running, execute:
# mdadm --create md0 --bitmap=clustered --raid-devices=2 --level=mirror \
  --assume-clean <device1> <device2>
With --bitmap=clustered, mdadm automatically creates multiple bitmaps
(one per node). The default is currently four nodes, but you can change
it with the --nodes=<number> option.
It also detects the cluster name, which is required for creating a
clustered md. To specify it explicitly, use --cluster-name=<name>.
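For example, a three-node setup might spell out both options explicitly
(the device paths and cluster name below are placeholders for your own
environment):
# mdadm --create md0 --bitmap=clustered --nodes=3 --cluster-name=mycluster \
  --raid-devices=2 --level=mirror --assume-clean <device1> <device2>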
2. On the other nodes, issue:
# mdadm --assemble md0 <device1> <device2>
This md device can be used as a regular shared device. There are no
restrictions on the type of filesystem or LVM you can use, as long as
you observe the clustering rules for using a shared device.
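For example, you can check on each node that the array has been
assembled and is using the clustered bitmap:
# cat /proc/mdstat
# mdadm --detail /dev/md0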
There is only one special case compared to a regular non-clustered md:
adding a device. This is because all nodes must be able to "see" the
device before it is added.
You can (hot) add a spare device by issuing the regular --add command.
# mdadm --manage /dev/md0 --add <device3>
The other nodes must acknowledge that they see the device by issuing:
# mdadm --manage /dev/md0 --cluster-confirm 2:<device3>
where 2 is the raid slot number. This step can be automated with a
udev script, because the module sends a udev event when another node
issues an --add. The uevent carries the usual device name parameters
plus:
EVENT=ADD_DEVICE
DEVICE_UUID=<uuid of the device>
RAID_DISK=<slot number>
Usually, you would use blkid to find the device's uuid and then issue
the --cluster-confirm command (a possible udev hook is sketched below).
If the node does not "see" the device, it must issue (or time out):
# mdadm --manage /dev/md0 --cluster-confirm 2:missing
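As a rough sketch of automating the confirmation, a udev rule could
match the ADD_DEVICE event and call a small helper script. The rule
path, the script name, and the use of blkid -U to resolve the uuid are
only illustrative assumptions, not part of the patch set:

# /etc/udev/rules.d/99-cluster-md-confirm.rules (hypothetical)
SUBSYSTEM=="block", ENV{EVENT}=="ADD_DEVICE", RUN+="/usr/local/sbin/cluster-md-confirm.sh $env{DEVNAME} $env{RAID_DISK} $env{DEVICE_UUID}"

#!/bin/sh
# /usr/local/sbin/cluster-md-confirm.sh (hypothetical helper)
# $1 = md device, $2 = raid slot number, $3 = uuid of the added device
MD_DEV="$1"
SLOT="$2"
UUID="$3"

# Try to resolve the uuid to a local device path (assumes blkid can see
# the new member on this node).
NEW_DEV=$(blkid -U "$UUID")

if [ -n "$NEW_DEV" ]; then
    mdadm --manage "$MD_DEV" --cluster-confirm "$SLOT:$NEW_DEV"
else
    # The device is not visible on this node.
    mdadm --manage "$MD_DEV" --cluster-confirm "$SLOT:missing"
fi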
References:
[1] mdadm tool changes: https://github.com/goldwynr/mdadm (branch: cluster-md)
[2] Patches against stable 3.14: https://github.com/goldwynr/linux (branch: cluster-md-devel)
[3] A guide to setting up a cluster using corosync/pacemaker:
https://www.suse.com/documentation/sle-ha-12/singlehtml/book_sleha/book_sleha.html
(Note: The basic concepts of this guide should work with any distro)
--
Goldwyn