Re: MD devnode still present after 'remove' udev event, and mdadm reports 'does not appear to be active'


Hello Neil,
can you please confirm something for me?
If the array is FAILED (when your enough() function returns 0) - for
example, after a simultaneous failure of all drives - then the only
option for trying to recover such an array is to do:
mdadm --stop
and then attempt
mdadm --assemble

correct?

I did not see any other option to recover such an array. Incremental
assembly doesn't work in that case; it simply adds the drives back as
spares.

Thanks,
  Alex.

On Sun, Sep 25, 2011 at 12:15 PM, NeilBrown <neilb@xxxxxxx> wrote:
> On Fri, 23 Sep 2011 22:24:08 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx>
> wrote:
>
>> Thank you, Neil, for answering.
>> I'm not sure that I understand all of this, because my knowledge of
>> Linux user-kernel interaction is, unfortunately, not sufficient. In
>> the future, I hope to know more.
>> For example, I don't understand how opening "/dev/mdXX" can create
>> a device in the kernel if the devnode "/dev/mdXX" does not exist. In
>> that case, I actually fail to open it with ENOENT.
>
> /dev/mdXX is a "device special file".  It is not the device itself.
> You can think of it like a symbolic link.
> The "real" name for the device is something like "block device with major 9
> and minor X".  That thing can exist quite independently of whether
> the /dev/mdXX thing exists, just like a file may or may not exist
> independently of whether some symlink to it exists.
>
> When the device (block,9,XX) appears, udev is told and it should create
> things in /dev.  When the device disappears, udev is told and it should
> remove the /dev entry.  But there can be races, and other things might
> sometimes add or remove /dev entries (though they shouldn't).  So the
> existence of something in /dev isn't a guarantee that it really exists.
>
>
>>
>> But what I did is actually similar to what you advised:
>> - if I fail to open the devnode with ENOENT, I know (?) that the
>> device does not exist
>> - otherwise, I do GET_ARRAY_INFO
>> - if it returns ok, then I go ahead and do GET_DISK_INFOs to get the
>> disks information
>> - otherwise if it returns ENODEV, I close the fd and then I read /proc/mdstat
>> - if the md is there, then I know it's an inactive array (and I have to
>> --stop it and reassemble or do incremental assembly)
>> - if the md is not there, then I know that it really does not exist
>> (this is the case when md deletion happened but the devnode did not
>> disappear yet)
>>
>> Does it sound right? It passes stress testing pretty well.
>
> Yes, that sounds right.
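The sequence Alex describes can be sketched in C. This is only a sketch under assumptions: error handling is minimal, the name parameter (e.g. "md0") is matched against the start of /proc/mdstat lines, and note that on an md devnode the open() itself can create an inactive device, as Neil explains elsewhere in the thread.

```c
#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <linux/major.h>       /* MD_MAJOR */
#include <linux/ioctl.h>       /* _IOR */
#include <linux/raid/md_u.h>   /* GET_ARRAY_INFO, mdu_array_info_t */

/* Classify an MD device node following the sequence above.
 * Returns: 0 = no such device, 1 = active array,
 * 2 = device exists but array is inactive (--stop and reassemble),
 * -1 = unexpected error. */
int classify_md(const char *devnode, const char *name)
{
    int fd = open(devnode, O_RDONLY);
    if (fd < 0)
        return errno == ENOENT ? 0 : -1;

    mdu_array_info_t info;
    int ret = ioctl(fd, GET_ARRAY_INFO, &info);
    int saved = errno;
    close(fd);

    if (ret == 0)
        return 1;              /* active: GET_DISK_INFO calls would follow */
    if (saved != ENODEV)
        return -1;

    /* ENODEV: either an inactive array, or the md was deleted but the
     * devnode has not disappeared yet.  /proc/mdstat decides which. */
    FILE *f = fopen("/proc/mdstat", "r");
    if (!f)
        return -1;
    char line[512];
    int found = 0;
    while (fgets(line, sizeof(line), f)) {
        size_t n = strlen(name);
        if (strncmp(line, name, n) == 0 && line[n] == ' ') {
            found = 1;
            break;
        }
    }
    fclose(f);
    return found ? 2 : 0;
}
```

A nonexistent path takes the ENOENT branch and classifies as "no such device" without ever touching the ioctl.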
>
>>
>> By the way, I understand that /proc/mdstat can only be 4K in size... so
>> if I have many arrays, I should probably switch to looking at
>> /sys/block....
>
> Correct.
>
> NeilBrown
>
>
>>
>> Thanks,
>>   Alex.
>>
>>
>>
>>
>>
>>
>> On Wed, Sep 21, 2011 at 8:03 AM, NeilBrown <neilb@xxxxxxx> wrote:
>> >
>> > On Tue, 13 Sep 2011 11:49:12 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx>
>> > wrote:
>> >
>> > > Hello Neil,
>> > > I am sorry for opening this again, but I am convinced now that I don't
>> > > understand what's going on:)
>> > >
>> > > Basically, I see that GET_ARRAY_INFO can also return ENODEV in case
>> > > the device in the kernel exists, but "we are not initialized yet":
>> > > /* if we are not initialised yet, only ADD_NEW_DISK, STOP_ARRAY,
>> > >  * RUN_ARRAY, and GET_ and SET_BITMAP_FILE are allowed */
>> > > if ((!mddev->raid_disks && !mddev->external)
>> > >     && cmd != ADD_NEW_DISK && cmd != STOP_ARRAY
>> > >     && cmd != RUN_ARRAY && cmd != SET_BITMAP_FILE
>> > >     && cmd != GET_BITMAP_FILE) {
>> > >       err = -ENODEV;
>> > >       goto abort_unlock;
>> > >
>> > > I thought that ENODEV means that the device in the kernel does not
>> > > exist, although I am not yet familiar enough with the kernel sources
>> > > to verify that.
>> > >
>> > > Basically, I just wanted to know whether there is a reliable way to
>> > > determine whether the kernel MD device exists or not. (Obviously,
>> > > successfully opening a devnode from user space is not enough.)
>> > >
>> > > Thanks,
>> > >   Alex.
>> >
>> > What exactly do you mean by "the kernel MD device exists" ??
>> >
>> > When you open a device-special-file for an md device (major == 9) it
>> > automatically creates an inactive array.  You can then fill in the details
>> > and activate it, or explicitly deactivate it.  If you do that it will
>> > disappear.
>> >
>> > Opening the devnode is enough to check that the device exists, because it
>> > creates the device and then you know that it exists.
>> > If you want to know if it already exists - whether inactive or not - look
>> > in /proc/mdstat or /sys/block/md*.
>> > If you want to know if it already exists and is active, look in /proc/mdstat,
>> > or open the device and use GET_ARRAY_INFO, or look in /sys/block/md*
>> > and look at the device size, or maybe /sys/block/mdXX/md/raid_disks.
>> >
>> > It depends on why you are asking.
>> >
>> > NeilBrown
>> >
>> >
>> >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > >
>> > > On Tue, Aug 30, 2011 at 12:25 AM, NeilBrown <neilb@xxxxxxx> wrote:
>> > > > On Mon, 29 Aug 2011 20:17:34 +0300 Alexander Lyakas <alex.bolshoy@xxxxxxxxx>
>> > > > wrote:
>> > > >
>> > > >> Greetings everybody,
>> > > >>
>> > > >> I issue
>> > > >> mdadm --stop /dev/md0
>> > > >> and I want to reliably determine that the MD devnode (/dev/md0) is gone.
>> > > >> So I look for the udev 'remove' event for that devnode.
>> > > >> However, in some cases even after I see the udev event, I issue
>> > > >> mdadm --detail /dev/md0
>> > > >> and I get:
>> > > >> mdadm: md device /dev/md0 does not appear to be active
>> > > >>
>> > > >> According to Detail.c, this means that mdadm can successfully do
>> > > >> open("/dev/md0") and receive a valid fd.
>> > > >> But later, when issuing ioctl(fd, GET_ARRAY_INFO) it receives ENODEV
>> > > >> from the kernel.
>> > > >>
>> > > >> Can somebody suggest an explanation for this behavior? Is there a
>> > > >> reliable way to know when a MD devnode is gone?
>> > > >
>> > > > Running "udevadm settle" after stopping /dev/md0 is most likely to work.
>> > > >
>> > > > I suspect that udev removes the node *after* you see the 'remove' event.
>> > > > Sometimes so soon after that you don't see the lag - sometimes a bit later.
>> > > >
>> > > > NeilBrown
>> > > >
>> > > >>
>> > > >> Thanks,
>> > > >>   Alex.
>> > > >
>> > > >
>> >
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-raid" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

