On Thu, 09 Oct 2014 11:40:25 +0200 Francis Moreau <francis.moro@xxxxxxxxx> wrote:

> On 10/08/2014 01:54 AM, NeilBrown wrote:
> > On Tue, 07 Oct 2014 09:05:43 +0200 Francis Moreau <francis.moro@xxxxxxxxx>
> > wrote:
> >
> >> Hi Neil,
> >>
> >> On 09/30/2014 09:43 AM, Francis Moreau wrote:
> >>> Hi Neil,
> >>>
> >>> On 09/29/2014 11:56 PM, NeilBrown wrote:
> >>>> On Mon, 29 Sep 2014 10:45:17 +0200 Francis Moreau <francis.moro@xxxxxxxxx>
> >>>> wrote:
> >>>>
> >>>>>> So what were pids 930 and 459?
> >>>>>> One was presumably the "mdadm -Ss" - probably 930.
> >>>>>> Is 459 the "mdadm --monitor" ?? That might be useful hint.
> >>>>>>
> >>>>>
> >>>>> yes.
> >>>>>
> >>>>> [456] is: /sbin/mdadm --monitor --scan --daemonise --syslog
> >>>>> --pid-file=/run/mdadm/mdadm.pid
> >>>>>
> >>>>> and [930] is 'mdamd -Ss'.
> >>>>
> >>>> Good. Please try the patch below.
> >>>>
> >>>
> >>> After applying your patch, this is what I'm getting in syslog:
> >>>
> >>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
> >>> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
> >>> [970]
> >>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
> >>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
> >>> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
> >>> [972]
> >>> Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
> >>> systemd-udevd [971]
> >>> Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
> >>> mdmonitor-takeover.service, ignoring: Invalid argument
> >>> Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
> >>> and management.
> >>> Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
> >>> systemd-udevd [971]
> >>> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> >>> on md device /dev/md125
> >>> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> >>> on md device /dev/md126
> >>> Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> >>> on md device /dev/md127
> >>> Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
> >>> 1863254016 to 0
> >>> Sep 30 03:40:08 localhost kernel: md: md125 stopped.
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
> >>> [970]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> >>> [466]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> >>> [466]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> >>> [970]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
> >>> Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
> >>> 67043328 to 0
> >>> Sep 30 03:40:08 localhost kernel: md: md126 stopped.
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> >>> [466]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> >>> [970]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> >>> [970]
> >>> Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
> >>> Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
> >>> 214564864 to 0
> >>> Sep 30 03:40:08 localhost kernel: md: md127 stopped.
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
> >>> Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
> >>> Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
> >>> Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> >>> [970]
> >>>
> >>> The ghost device is no more present so your patch seems to have fixed my
> >>> issue. But I must admit I don't really understand what's going on :-/
> >>>
> >>
> >> Since those 'ghost' devices are expected from the MD implementation
> >> point of view, I'm wondering how am I supposed to detect them or maybe
> >> how an application is supposed to recognized online arrays.
> >
> > If your application is looking in /proc/mdstat, then the "ghost" devices will
> > be either "inactive" or not present at all.
> > If your application is looking in /sys/block/md*, then the "ghost" devices
> > will have "clear" or "inactive" in /sys/block/mdXX/md/array_state.
> >
> > If you use the new "CREATE names=yes" line in mdadm.conf (mdadm 3.3 or
> > later), and use kernel 3.17 or later, and use names rather than numbers to
> > identify your arrays (/dev/md/home, /dev/md_root), then the "ghost" problem
> > will be gone, and names in /proc/mdstat will be e.g. "md_home", or "md_root"
> > rather than "md4" or "md127".
> >
> >>
> >> My application uses udev to detect and to get information about new
> >> devices. I don't think the information exported by udev is enough to
> >> figure this out. Also please note that since I rely on udev, I can't
> >> really read information on /sys since this information may be out of
> >> sync with the one returned by udev.
> >
> > If udev reports that an array exists, then it really did exist when udev got
> > the message. By the time your program gets run by udev, it might not exist
> > any more. i.e. udev is always racy.
>
> Yes, but reading sysfs is also racy. I was thinking that the advantage
> of using udev is that it gives me a *consistent* (perhaps outdated)
> snapshot of the device state.
>

In what sense do you think sysfs is racy?

What exactly do you want to do with the udev event?

The event from the kernel to udev only contains the identity of the device
and the type of event (add, change, remove).
Some drivers add extra 'environment' information.
 - 'dm' adds a 'cookie'.
 - bcache adds a 'CACHED_UUID' and 'CACHED_LABEL'
 - libata-acpi adds a 'BAY_EVENT'
but in general there is nothing extra.

If udev adds stuff (which it probably does), it is just as racy as anything
that you might determine and add yourself.

NeilBrown
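
For illustration, the sysfs check mentioned earlier in the thread (a "ghost"
device shows "clear" or "inactive" in /sys/block/mdXX/md/array_state) could be
sketched roughly as below. This is only a minimal sketch: the function names
and the sys_block parameter are made up for the example, and only the sysfs
path and the two state strings come from the discussion. Like any sysfs read,
the result only describes the moment the file was read.

```python
from pathlib import Path

# States that, per the thread, mean "no running array behind this device".
GHOST_STATES = {"clear", "inactive"}


def md_array_states(sys_block="/sys/block"):
    """Return {md device name: contents of its md/array_state file}.

    sys_block is a hypothetical parameter so the lookup can be pointed
    at a test directory instead of the real sysfs.
    """
    states = {}
    for state_file in sorted(Path(sys_block).glob("md*/md/array_state")):
        try:
            # state_file is <sys_block>/<mdNNN>/md/array_state
            name = state_file.parent.parent.name
            states[name] = state_file.read_text().strip()
        except OSError:
            # The array was stopped between the glob and the read; skip it.
            continue
    return states


def online_arrays(sys_block="/sys/block"):
    """Names of md devices whose array_state is not a 'ghost' state."""
    return [name for name, state in md_array_states(sys_block).items()
            if state not in GHOST_STATES]


if __name__ == "__main__":
    for name, state in md_array_states().items():
        kind = "ghost" if state in GHOST_STATES else "online"
        print(f"{name}: {state} ({kind})")
```

A program run from a udev rule could call online_arrays() when it handles the
event, accepting that the array may already have changed state by then.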