mdctl - a lifesaver

Neil,

You have probably seen my earlier posts about running 128 drives
in 8 raids...and beefing up the md driver a little..

Raidtools (0.90) works fine for its intended use, as does "autoconfig".

However, when running LOTS of drives (e.g. 128 or more) split into
multiple MD devices, we learned some things the hard way.

1) We should have used "devfs" from the start!

Device naming for SCSI in Enigma (Red Hat 7.2) just doles out
/dev/sda, sdb, etc. as the SCSI devices are probed; after sdz comes
sdaa, sdab, and so on.  Our last one (the 128th drive) was /dev/sddx.

All was fine and well.. we created the raids with raidtools (mkraid to
build them, raidstart to run them), until some drives were added or
removed from the middle of the probed space.  Then all the device names
and minors (and sometimes majors) change from the point of the
device addition or removal onward.  What a pain with raidtools.

Raidstart only passes the first device in the config file
to the md driver, and the md driver reads that device's
superblock to get the major/minors of the rest of the
pieces of the array.  Everything after the first entry in
/etc/raidconf is ignored by raidstart.  If your devices
(names and major/minors) move, you are pretty much SOL.
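
For example, given a minimal raidtools config like the sketch below
(device names are just illustrative), raidstart hands only the first
"device" line to the md driver; the rest of the layout comes from that
drive's superblock:

    raiddev /dev/md0
        raid-level            5
        nr-raid-disks         3
        nr-spare-disks        0
        persistent-superblock 1
        chunk-size            64
        # only this first device entry gets passed to the md driver
        device                /dev/sdaa1
        raid-disk             0
        # the entries below are ignored by raidstart
        device                /dev/sdab1
        raid-disk             1
        device                /dev/sdac1
        raid-disk             2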

mdctl to the rescue.  Thanks Neil!

mdctl --assemble /dev/mdX /dev/one, /dev/two, etc. is a real lifesaver:
it will start the array from the list of devices on the command line,
with the "--uuid=XXXXXXXXXX" insurance so the wrong devs are not picked
up by mistake.  It turns out that once started, the kernel MD driver
writes the new major/minors back to the superblocks of the devices.

By using devfs, the device names do not move around when drives
are added or removed from the middle, a great "safety" factor
when you have a bunch of raid stacks.  They only move if a controller
is added or removed in the middle, and mdctl handles even that.

The only (minor) downside is that the "real" dev names with devfs
are kind of long (like /dev/scsi/host4/bus0/target4/lun7/part1),
though you get symlinks such as /dev/sd/c4b0t4u7p1 for the above.
However, /proc/mdstat prints the LONG names, which now take 5 lines on
a 140-char-wide screen for a 16-drive raid + 2 spares, where
mdstat used to fit on one line per raid.

This makes a shell script to feed args to mdctl at startup a snap:

mdctl --assemble /dev/mdX --uuid=XXXXXXXXXXXX /dev/sd/c?b0t?u4p1
for instance to grab a bunch of drives.
(I got this idea from Frank Samuelson's recent post on this list,
but he used the very long pathnames instead)
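
Putting that together, the whole startup script ends up being just a
handful of lines, something like the sketch below (the UUIDs and globs
are placeholders, not our real ones):

    #!/bin/sh
    # assemble each raid stack from its own partition glob, with --uuid
    # as insurance against picking up the wrong drives by mistake
    mdctl --assemble /dev/md0 --uuid=XXXXXXXXXXXX /dev/sd/c?b0t?u0p1
    mdctl --assemble /dev/md1 --uuid=YYYYYYYYYYYY /dev/sd/c?b0t?u1p1
    # ...and so on, one line per md device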

----
Now, for Neil's "wish list", it sure would be nice to be able to:
mdctl --assemble /dev/mdX --uuid=XXXXXXXXXXXX /dev/sd/c*p1

However, that expands into all 128 drives (which is what we have at
the moment), while only 16 + 2 spares carry the specified UUID.  The way
that mdctl parses the dev list args is to put them into an array
sized MD_SB_DISKS (which is 27 in the current kernel)... and
this limit is hit.  I could not just fake out mdctl by increasing
MD_SB_DISKS to 128 or some other big number, since it reads
raid superblocks whose layout requires MD_SB_DISKS to be 27.

It looks like it will take some work to accept more dev args than
MD_SB_DISKS, even though after examining all the SBs
there are still <= MD_SB_DISKS that match the UUID.  Also,
what about the case where there are MD_SB_DISKS "raid"
drives and then more "spare" drives defined?  No big deal;
I am very glad that Neil and mdctl came along!
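
In the meantime, a workaround I have been considering (untested, and it
assumes "mdctl --examine <dev>" prints the array UUID somewhere in its
output, which may not match the actual format) is to whittle the glob
down in the shell before --assemble ever sees it:

    #!/bin/sh
    # untested sketch: pre-filter a big glob down to only the drives whose
    # superblock carries the wanted UUID, so mdctl --assemble gets a short list
    UUID=XXXXXXXXXXXX
    DEVS=""
    for d in /dev/sd/c?b0t?u?p1; do
        # assumes the UUID shows up somewhere in the --examine output
        if mdctl --examine $d 2>/dev/null | grep -q "$UUID"; then
            DEVS="$DEVS $d"
        fi
    done
    mdctl --assemble /dev/mdX --uuid=$UUID $DEVS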
---

Also, what is the state of the art in array monitoring
(mdctl --monitor): watching for drives failing, and doing a
raidhotadd from a single spares pool to multiple md devices?
Has Neil or anyone else expanded on mdctl --monitor?  I also
once heard that there were commercial monitoring programs
that worked with Linux MD raids.
Any comments before I delve into it?  Thanks in advance.
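
Just to make the question concrete, the sort of thing I have in mind is
roughly the sketch below; the spare device names are made up, there is
no bookkeeping for failures already handled, and I do not actually run
this:

    #!/bin/sh
    # rough sketch only: poll /proc/mdstat for members marked failed "(F)"
    # and hot-add the next drive from a single shared spares pool
    SPARES="/dev/sd/c5b0t0u0p1 /dev/sd/c5b0t1u0p1"
    while true; do
        # any mdstat line containing (F) names an array with a failed member
        for md in `awk '/\(F\)/ {print $1}' /proc/mdstat`; do
            set -- $SPARES
            spare=$1; shift; SPARES="$*"
            [ -n "$spare" ] && raidhotadd /dev/$md $spare
        done
        sleep 60
    done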

--ghg
