Re: MD devnode still present after 'remove' udev event, and mdadm reports 'does not appear to be active'

On Fri, 23 Sep 2011 22:24:08 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx>
wrote:

> Thank you, Neil, for answering.
> I'm not sure that I understand all of this, because my knowledge of
> Linux user-kernel interaction is, unfortunately, not sufficient. In
> the future, I hope to know more.
> For example, I don't understand, how opening a "/dev/mdXX" can create
> a device in the kernel, if the devnode "/dev/mdXX" does not exist. In
> that case, I actually fail to open it with ENOENT.

/dev/mdXX is a "device special file".  It is not the device itself.
You can think of it like a symbolic link.
The "real" name for the device is something like "block device with major 9
and minor XX".  That thing can exist quite independently of whether
the /dev/mdXX thing exists, just as a file may or may not exist
independently of whether some sym-link to it exists.

When the device (block,9,XX) appears, udev is told and it should create
things in /dev.  When the device disappears, udev is told and it should
remove the /dev entry.  But there can be races, and other things might
sometimes add or remove /dev entries (though they shouldn't).  So the
existence of an entry in /dev isn't a guarantee that the device behind it
really exists.
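
A minimal Python sketch of this distinction (an illustration only, not code
from mdadm): it reports which (major, minor) pair a /dev entry names, which
says nothing about whether the kernel device behind it is alive.  The helper
names node_identity and names_md_device are hypothetical; MD_MAJOR = 9 is the
major number mentioned above.

```python
import os
import stat

MD_MAJOR = 9  # major number for md block devices, as stated in the message


def node_identity(path):
    """Return (major, minor) if path is a block device special file, else None.

    This only inspects the name in the filesystem; it does not open the
    device, so it cannot tell whether the kernel device really exists.
    """
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None  # no such entry in the filesystem
    if not stat.S_ISBLK(st.st_mode):
        return None  # exists, but is not a block device special file
    return os.major(st.st_rdev), os.minor(st.st_rdev)


def names_md_device(path):
    """True if path is a block special file whose major number is MD_MAJOR."""
    ident = node_identity(path)
    return ident is not None and ident[0] == MD_MAJOR
```

For example, names_md_device("/dev/md0") only answers "does this name point at
(block,9,something)?".  A stale node left behind by a race would still answer
yes.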


> 
> But what I did is actually similar to what you advised:
> - if I fail to open the devnode with ENOENT, I know (?) that the
> device does not exist
> - otherwise, I do GET_ARRAY_INFO
> - if it returns ok, then I go ahead and do GET_DISK_INFOs to get the
> disks information
> - otherwise if it returns ENODEV, I close the fd and then I read /proc/mdstat
> - if the md is there, then I know it's inactive array (and I have to
> --stop it and reassemble or do incremental assembly)
> - if the md is not there, then I know that it really does not exist
> (this is the case when md deletion happened but the devnode did not
> disappear yet)
> 
> Does it sound right? It passes stress testing pretty well.

Yes, that sounds right.
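
The steps quoted above can be sketched roughly as follows.  This is a hedged
illustration in Python, not mdadm's code.  Assumptions: the GET_ARRAY_INFO
value is computed by hand as _IOR(9, 0x11, 72-byte mdu_array_info_t) for the
common (x86-style) ioctl encoding and may differ on other architectures; the
array is looked up in /proc/mdstat by the basename of the devnode; and the
names classify, listed_in_mdstat and probe are made up for this example.
Note that, as described further down this thread, opening an md devnode may
itself create an inactive array.

```python
import errno
import fcntl
import os
import re

# Assumed value: _IOR(9, 0x11, 72 bytes) with x86-style ioctl number encoding.
GET_ARRAY_INFO = 0x80480911
ARRAY_INFO_SIZE = 72  # assumed sizeof(mdu_array_info_t): 18 ints


def classify(open_errno, ioctl_errno, in_mdstat):
    """Pure decision logic for the procedure described in the message."""
    if open_errno == errno.ENOENT:
        return "absent"        # devnode missing: treated as "no device"
    if open_errno is not None:
        return "unknown"       # some other open failure (e.g. permissions)
    if ioctl_errno is None:
        return "active"        # GET_ARRAY_INFO succeeded
    if ioctl_errno == errno.ENODEV:
        # The devnode opened, but the array is not initialised.
        return "inactive" if in_mdstat else "absent"
    return "unknown"


def listed_in_mdstat(name, text):
    """True if a line of the form 'mdX : ...' for this name appears in text."""
    pattern = rf"^{re.escape(name)} :"
    return re.search(pattern, text, re.MULTILINE) is not None


def probe(devnode):
    """Run the open / GET_ARRAY_INFO / mdstat checks against a devnode."""
    name = os.path.basename(devnode)
    try:
        fd = os.open(devnode, os.O_RDONLY)
    except OSError as e:
        return classify(e.errno, None, False)
    ioctl_errno = None
    try:
        fcntl.ioctl(fd, GET_ARRAY_INFO, bytearray(ARRAY_INFO_SIZE))
    except OSError as e:
        ioctl_errno = e.errno
    finally:
        os.close(fd)  # close before consulting /proc/mdstat, as in the message
    in_mdstat = False
    if ioctl_errno == errno.ENODEV:
        try:
            with open("/proc/mdstat") as f:
                in_mdstat = listed_in_mdstat(name, f.read())
        except OSError:
            pass  # no /proc/mdstat: treat the array as not listed
    return classify(None, ioctl_errno, in_mdstat)
```

Keeping classify() separate from probe() means the decision table can be
checked without any md devices present.  An "inactive" result is the case
that needs --stop followed by reassembly.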

> 
> By the way, I understand that /proc/mdstat can be only of 4K size...so
> if I have many arrays, I should probably switch to look at
> /sys/block....

Correct.
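
A rough Python sketch of the /sys/block approach, for illustration only.  It
assumes each md device appears as /sys/block/<name>/md and that
md/raid_disks holds a leading integer, with an empty file meaning zero (an
array that is not set up).  The function names parse_raid_disks and
list_md_devices are hypothetical.

```python
import os


def parse_raid_disks(text):
    """Parse the contents of md/raid_disks; empty content is treated as 0."""
    tokens = text.split()
    return int(tokens[0]) if tokens else 0


def list_md_devices(sys_block="/sys/block"):
    """Map md device name -> raid_disks count (None if it cannot be read)."""
    result = {}
    try:
        entries = sorted(os.listdir(sys_block))
    except OSError:
        return result  # sysfs not available at this path
    for name in entries:
        md_dir = os.path.join(sys_block, name, "md")
        if not (name.startswith("md") and os.path.isdir(md_dir)):
            continue
        try:
            with open(os.path.join(md_dir, "raid_disks")) as f:
                result[name] = parse_raid_disks(f.read())
        except (OSError, ValueError):
            result[name] = None  # the device may have vanished mid-scan
    return result
```

Under those assumptions, a device that is listed with a count of 0 exists but
is not configured, and a device that is not listed does not exist at all.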

NeilBrown


> 
> Thanks,
>   Alex.
>
> On Wed, Sep 21, 2011 at 8:03 AM, NeilBrown <neilb@xxxxxxx> wrote:
> >
> > On Tue, 13 Sep 2011 11:49:12 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx>
> > wrote:
> >
> > > Hello Neil,
> > > I am sorry for opening this again, but I am convinced now that I don't
> > > understand what's going on:)
> > >
> > > Basically, I see that GET_ARRAY_INFO can also return ENODEV in case
> > > the device in the kernel exists, but "we are not initialized yet":
> > > /* if we are not initialised yet, only ADD_NEW_DISK, STOP_ARRAY,
> > >  * RUN_ARRAY, and GET_ and SET_BITMAP_FILE are allowed */
> > > if ((!mddev->raid_disks && !mddev->external)
> > >     && cmd != ADD_NEW_DISK && cmd != STOP_ARRAY
> > >     && cmd != RUN_ARRAY && cmd != SET_BITMAP_FILE
> > >     && cmd != GET_BITMAP_FILE) {
> > >       err = -ENODEV;
> > >       goto abort_unlock;
> > >
> > > I thought that ENODEV means that the device in the kernel does not
> > > exist, although I am not this familiar with the kernel sources (yet)
> > > to verify that.
> > >
> > > Basically, I just wanted to know whether there is a reliable way to
> > > determine whether the kernel MD device exists or not. (Obviously,
> > > successfully opening a devnode from user space is not enough).
> > >
> > > Thanks,
> > >   Alex.
> >
> > What exactly do you mean by "the kernel MD device exists" ??
> >
> > When you open a device-special-file for an md device (major == 9) it
> > automatically creates an inactive array.  You can then fill in the details
> > and activate it, or explicitly deactivate it.  If you do that it will
> > disappear.
> >
> > Opening the devnode is enough to check that the device exists, because it
> > creates the device and then you know that it exists.
> > If you want to know if it already exists - whether inactive or not - look
> > in /proc/mdstat or /sys/block/md*.
> > If you want to know if it already exists and is active, look in /proc/mdstat,
> > or open the device and use GET_ARRAY_INFO, or look in /sys/block/md*
> > and look at the device size, or maybe at /sys/block/mdXX/md/raid_disks.
> >
> > It depends on why you are asking.
> >
> > NeilBrown
> >
> > >
> > > On Tue, Aug 30, 2011 at 12:25 AM, NeilBrown <neilb@xxxxxxx> wrote:
> > > > On Mon, 29 Aug 2011 20:17:34 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx>
> > > > wrote:
> > > >
> > > >> Greetings everybody,
> > > >>
> > > >> I issue
> > > >> mdadm --stop /dev/md0
> > > >> and I want to reliably determine that the MD devnode (/dev/md0) is gone.
> > > >> So I look for the udev 'remove' event for that devnode.
> > > >> However, in some cases even after I see the udev event, I issue
> > > >> mdadm --detail /dev/md0
> > > >> and I get:
> > > >> mdadm: md device /dev/md0 does not appear to be active
> > > >>
> > > >> According to Detail.c, this means that mdadm can successfully do
> > > >> open("/dev/md0") and receive a valid fd.
> > > >> But later, when issuing ioctl(fd, GET_ARRAY_INFO) it receives ENODEV
> > > >> from the kernel.
> > > >>
> > > >> Can somebody suggest an explanation for this behavior? Is there a
> > > >> reliable way to know when a MD devnode is gone?
> > > >
> > > > Running "udevadm settle" after stopping /dev/md0 is most likely to work.
> > > >
> > > > I suspect that udev removes the node *after* you see the 'remove' event.
> > > > Sometimes so soon after that you don't see the lag - sometimes a bit later.
> > > >
> > > > NeilBrown
> > > >
> > > >>
> > > >> Thanks,
> > > >>   Alex.
> > > >> --
> > > >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> > > >> the body of a message to majordomo@xxxxxxxxxxxxxxx
> > > >> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > >
> > > >
> >
