RE: RFC - device names and mdadm with some reference to udev.

> -----Original Message-----
> From: linux-raid-owner@xxxxxxxxxxxxxxx [mailto:linux-raid-
> owner@xxxxxxxxxxxxxxx] On Behalf Of Kay Sievers
> Sent: Monday, October 27, 2008 7:42 AM
> To: Neil Brown
> Cc: linux-raid@xxxxxxxxxxxxxxx; Doug Ledford; martin f. krafft; Michal
> Marek
> Subject: Re: RFC - device names and mdadm with some reference to udev.
> 
> On Sun, Oct 26, 2008 at 23:56, Neil Brown <neilb@xxxxxxx> wrote:
> >  Device naming in mdadm is a bit of a mess.
> 
> >  In 2.6.28, partitioned devices (mdp) won't be needed any more as md
> >  will make use of the "extended partition" functionality recently
> >  added.
> 
> You mean the extended minor space, right? Or the extended partitions,
> which are a format in a msdos table?
> 
> >  1/ The only device nodes created will be /dev/mdX and /dev/md_dX
> >    along with partitions /dev/mdXpY and /dev/md_dXpY as appropriate.
> >    These will be created by mdadm in accordance with the "--auto"
> >    flag unless something in mdadm.conf says to leave it to udev.
> >    In that case, mdadm will create a temporary node
> >    (/dev/.mdadm.whatever) and remove it once udev has created the
> >    real thing.
> 
> Sounds fine, if mdadm needs a device node. It could also wait for udev
> to have the node created, but having a temporary node sounds fine, as
> long as it will not clash with anything udev is creating.
> 
> >  2/ There will be various symlinks to these devices.
> >    a/ if "symlinks=yes" is given in mdadm.conf, symlinks from
> >         /dev/md/X or /dev/md/dX will be created.
> >    b/ if udev is configured like on Debian,
> >              /dev/disk/by-id/md-name-XXXX
> >        and   /dev/disk/by-id/md-uuid-UUUU
> >       will be created (by udev).
> 
> Yes, almost all distros have that.
> 
> >    I'm contemplating creating a link based on the metadata type with
> >    a sequential number. e.g. /dev/md/ddf1 or /dev/md/imsm2.
> >    I'm not sure if there should be in /dev/md/ or directly in /dev/.
> >    I'm also not sure if I should leave the creation to udev, and
> >    whether I should use a small sequential number, or just whatever
> >    number was allocated as the minor number of the device.
> 
> There is intentionally no support for enumeration in udev, it will
> just not work and such numbers/links are not reproducible in hotplug
> environments, and therefore totally useless, and do much more harm
> than good.
> 
> Nothing must ever depend on enumeration, or minor numbers, if these
> properties cannot be made persistent, attached to the device itself, so
> that it will always show up with the same number forever. Better do
> not even start such an idea, and leave the kernel name as the primary
> "random" number, instead of creating new randomness on top.
> 
> >  4/ When we stop an array, mdadm will remove anything from /dev that
> >    it probably created.
> 
> Sure, but only if mdadm has created it.
> 
> >    In particular, it will remove the device node as described in 1,
> >    any partitions, and any symlinks in /dev or /dev/md which point to
> >    any of those.  I need to be certain that this won't confuse udev.
> 
> You must never touch anything that udev has created. It must be driven
> by kernel "add/remove/change" events.
> 
> >  1/ People want auto-assembly.  I've always fought against it (we
> >    don't auto-mount all filesystems do we?).
> 
> Some systems do automount all devices. Most systems do only hotplug
> devices which are not listed in /etc/fstab. Expect in the future that
> there will always be auto-assembly and also auto-mounting to some
> degree. All the newer storage buses, like iSCSI and such will always
> need  auto-mounting on device discovery, and not work with any
> bootup-script logic.
> 
> >    But it is a losing
> >    battle.  And on a modern desktop, when you plug in a new drive the
> >    filesystem is automatically mounted.  So my argument is falling
> >    apart.
> 
> Yes, we will need to support that as a common setup.
> 
> > I'm also wondering if I should include a udev 'rules' file for md in
> > the mdadm distribution.  Obviously it would be no more than a
> > recommendation, but it might give me a voice in guiding how udev
> > interacted with mdadm.
> 
> Definitely, it should carry a udev rules file which instructs udev to
> create all intended symlinks and also supports the raid auto-assembly
> setup. It should not mount anything by default though.
> 
> I'm happy to see you working on next-generation mdadm. I'd like to see
> a better integration with udev, and especially, if mdadm detects a
> running udev, not to mess around in /dev in any way, but leave the
> names in /dev to instructions in udev rules. Temporary nodes are fine,
> as long as they don't conflict with anything else, and get removed
> after they are not needed anymore. All updates to symlinks and such
> should be done by "change" events from the kernel, which instructs
> udev to update all the links, and not by touching anything in /dev
> from mdadm.
> 
> Do you think mdadm will stay a program only, called by udev/the user,
> or will a port of its functionality live in a daemon?
> 
> Thanks,
> Kay
> --

I am with Kay here: never force automount.
I put that right up there with the bonehead MSFT rule of trying to write
signatures on disk drives as soon as they appear.

Furthermore, don't just delete /dev/md names.  That would be an even greater
mistake.  Linux today has storage on SANs, clustering, multi-tasking, multi-pathing,
and SAN-management/monitoring software that will be using the device paths you want to
delete.
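
To make the "only remove what you created" rule concrete, here is a rough Python
sketch.  It is not how mdadm works today; the class and function names are
hypothetical, and a real tool would have to persist the record somewhere, since
assemble and stop are separate invocations.  The point is only that cleanup walks a
list of paths the tool itself made and never scans /dev for things that look like its own.

```python
import os
import tempfile


class CreatedNodes:
    """Remember the paths this tool created, so cleanup never touches others."""

    def __init__(self):
        self._created = set()

    def create_symlink(self, target, link_path):
        # Create the link and record that we own it.
        os.symlink(target, link_path)
        self._created.add(link_path)

    def cleanup(self):
        # Remove only recorded paths; anything else in the directory is left alone.
        removed = []
        for path in sorted(self._created):
            if os.path.lexists(path):
                os.unlink(path)
                removed.append(path)
        self._created.clear()
        return removed


def demo():
    # A temporary directory stands in for /dev so this runs without root.
    with tempfile.TemporaryDirectory() as dev:
        foreign = os.path.join(dev, "md-uuid-example")  # pretend udev made this
        os.symlink("md0", foreign)

        tracker = CreatedNodes()
        own = os.path.join(dev, "0")                    # pretend /dev/md/0
        tracker.create_symlink("md0", own)

        removed = tracker.cleanup()
        return removed == [own], os.path.lexists(foreign)
```

Calling demo() removes the tool's own link and leaves the link it did not create in
place, which is the behaviour monitoring software holding those paths would need.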

I can't think of a simple fix, but I can think of a complicated one that makes this play
nice in such environments, both when things are good and when things go bad.  My
outside-the-box suggestion is to present md target devices as a SCSI RAID controller or
processor device, using ANSI-defined sense keys/ASC values so that apps running
remotely or even locally can query immediate state.  If the md device is broken, then
report the same sense information (not ready, spun down, whatever) that a physical disk
would report, for each of the various partitions.
More importantly, use EVPD Inquiry and log pages to query configuration information for
both the /dev/md device AND all of the partitions, along with health and anything else.
Enterprise management software wouldn't have to log into the Linux host and run custom
scripts to see what is going on.  Use the mode pages for control/configuration change
requests (MODE SENSE to read the current settings, MODE SELECT to send changes).
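
As a rough illustration of the sense-key idea, here is a Python sketch that turns an md
array state into fixed-format SCSI sense data.  The state strings are the ones md exposes
in sysfs as array_state; which sense code each state should map to is purely my
assumption, and the function names are made up for the example.  The sense key and
ASC/ASCQ values are quoted from memory of the SCSI specs, so check them against the
standard before relying on them.

```python
# Sense keys and ASC/ASCQ pairs as recalled from the SCSI specs (SPC).
NO_SENSE = 0x00
NOT_READY = 0x02

# Hypothetical mapping from md array_state strings to (sense key, ASC, ASCQ).
# The choice of code per state is an assumption made for illustration.
STATE_TO_SENSE = {
    "clean":    (NO_SENSE, 0x00, 0x00),
    "active":   (NO_SENSE, 0x00, 0x00),
    "inactive": (NOT_READY, 0x04, 0x02),  # not ready, initializing command required
    "clear":    (NOT_READY, 0x04, 0x00),  # not ready, cause not reportable
}


def fixed_sense(key, asc, ascq):
    """Build an 18-byte fixed-format sense buffer (response code 0x70)."""
    buf = bytearray(18)
    buf[0] = 0x70          # current error, fixed format
    buf[2] = key & 0x0F    # sense key lives in the low nibble of byte 2
    buf[7] = 10            # additional sense length: bytes 8..17
    buf[12] = asc          # additional sense code
    buf[13] = ascq         # additional sense code qualifier
    return bytes(buf)


def sense_for_state(state):
    """Return sense data for an md state; unknown states report NOT READY."""
    key, asc, ascq = STATE_TO_SENSE.get(state, (NOT_READY, 0x04, 0x00))
    return fixed_sense(key, asc, ascq)
```

A remote management tool would then decode bytes 2, 12 and 13 exactly as it does for a
physical disk, with no md-specific script on the host.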

ANSI provides a mechanism and options for defining a unique naming convention, and you can
even add a UUID in the format you want as a vendor-specific layout.  There is already a foundation
for such work due to the iSCSI logic, but obviously much more work is required.

Yes, this is not a simple & easy fix, but if you want to future-proof everything and make Linux storage
easy to integrate into heterogeneous environments, then let ANSI be your guide.
David


