Re: Network based (iSCSI) RAID1 setup

On 10/05/2017 14:08, Adam Goryachev wrote:
> It depends on your definition of production, but for me, the answer is
> no. Once upon a time, I used MD to do RAID1 between a local SSD and a
> remote device with NBD and that worked well, (apart from the fact I
> needed to manually re-add the remote device after a reboot, or whenever
> it dropped out for any other reason). It did save me when the local SSD
> died, and I was able to keep running purely from the remote NBD device
> until I could get in and replace the local SSD.
>
> Today, I use DRBD, and would much prefer that compared to MD + NBD.

Thanks for your feedback, Adam. I agree with you, for the reasons explained below. In the hope that it will be useful to others, I'll document my findings here.

I'm using two CentOS 7.3 x86-64 boxes, with kernel version
3.10.0-514.16.1.el7.x86_64 and mdadm v3.4 - 28th January 2016. My
current setup is a two-disk RAID1 array where /dev/sdb is the iSCSI-imported disk.
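For reference, a similar two-disk mirror can be created along these lines (device names, the --write-mostly flag and the internal bitmap are just an illustrative sketch, not my exact commands):

    # /dev/sda3 = local disk, /dev/sdb = iSCSI-imported disk.
    # --write-mostly keeps normal reads on the local disk; the internal
    # write-intent bitmap allows fast re-adds after a disconnection.
    mdadm --create /dev/md0 --level=1 --raid-devices=2 \
          --bitmap=internal /dev/sda3 --write-mostly /dev/sdb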

>> So, second question: how to enable auto re-add for the remote device
>> when it becomes available again? For example:

> I don't know, but I guess you need to work out what udev rules are
> triggered when the iscsi device is "connected", and then get that to
> trigger the MD add rules. Possibly you could try to create a partition
> on the iscsi, and then use sdb1 for the RAID array, there might be
> better handling by udev in that case (I really don't know, just making
> random suggestions here).

I worked out how to enable auto re-add: the key was to add a default "POLICY action=re-add" line to /etc/mdadm.conf; with that in place, any removed disk is automatically re-attached as soon as it becomes visible again.
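The relevant part of /etc/mdadm.conf then ends up looking roughly like this (the ARRAY line is only a placeholder; "mdadm --detail --scan" prints the real one):

    # Default policy: automatically re-add any disk that reappears after
    # having been removed from an array.
    POLICY action=re-add

    # Array definition (placeholder; generate with "mdadm --detail --scan").
    ARRAY /dev/md0 metadata=1.2 UUID=<uuid-of-the-array>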

There is a catch, though: with iSCSI, when the remote disk becomes unresponsive and is dropped, its device entries are *not* removed from /dev by udev. Since both the removal and the auto re-add depend on udev events triggered by changes in /dev (i.e. a drive disappearing and/or reappearing), rebooting the remote host causes the iSCSI-imported disk to be marked as failed but not as removed (because its /dev entries are still there); later, when the iSCSI disk becomes visible again, it is re-added as a spare.
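Until a better udev-based solution turns up, the state can be fixed by hand with the usual remove/re-add dance; a sketch, assuming /dev/md0, /dev/sdb and an internal write-intent bitmap as in the example above:

    # Explicitly drop the failed member, then re-add it once the iSCSI
    # disk is reachable again; with a write-intent bitmap, --re-add only
    # resyncs the blocks written while the remote disk was away.
    mdadm /dev/md0 --remove /dev/sdb
    mdadm /dev/md0 --re-add /dev/sdb
    cat /proc/mdstat   # verify the recovery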

So, while mdadm by itself worked quite well with networked disks, I agree with Adam that this is too fragile a setup for production workloads. Specifically (config sketches for the first two points follow below):

- the default iSCSI timeout (120 seconds) is way too high and needs to be adjusted;
- because mdadm by default scans all disk devices for valid arrays, *both* the local and the remote machine can see the md array. This must be avoided by using the ARRAY <ignore> directive in the mdadm.conf file on the remote machine (i.e. the one which exports the iSCSI drive);
- udev is clearly not geared towards handling events caused by remote disks that suddenly disconnect without warning.
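For the first two points, the knobs involved are roughly the following (the timeout value and the UUID are examples only, not recommendations):

    # /etc/iscsi/iscsid.conf on the initiator: give up on an unreachable
    # target much sooner than the 120-second default (example value).
    node.session.timeo.replacement_timeout = 15

    # /etc/mdadm.conf on the exporting machine: never assemble the array
    # that lives on the exported disk (placeholder UUID).
    ARRAY <ignore> UUID=<uuid-of-the-md-array>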

In the end, none of the problems seem related to mdadm itself, which remains a wonderful and extremely flexible tool for managing software RAID. Thank you all for the hard work.

Regards.

--
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti@xxxxxxxxxx - info@xxxxxxxxxx
GPG public key ID: FF5F32A8


