Re: /sys/block/md126 still exists even after stopping the array

On Tue, 07 Oct 2014 09:05:43 +0200 Francis Moreau <francis.moro@xxxxxxxxx>
wrote:

> Hi Neil,
> 
> On 09/30/2014 09:43 AM, Francis Moreau wrote:
> > Hi Neil,
> > 
> > On 09/29/2014 11:56 PM, NeilBrown wrote:
> >> On Mon, 29 Sep 2014 10:45:17 +0200 Francis Moreau <francis.moro@xxxxxxxxx>
> >> wrote:
> >>
> >>>> So what were pids 930 and 459?
> >>>> One was presumably the "mdadm -Ss"  - probably 930.
> >>>> Is 459 the "mdadm --monitor" ??  That might be useful hint.
> >>>>
> >>>
> >>> yes.
> >>>
> >>> [456] is:  /sbin/mdadm --monitor --scan --daemonise --syslog
> >>> --pid-file=/run/mdadm/mdadm.pid
> >>>
> >>> and [930] is 'mdadm -Ss'.
> >>
> >> Good.  Please try the patch below.
> >>
> > 
> > After applying your patch, this is what I'm getting in syslog:
> > 
> > Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
> > Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
> > [970]
> > Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [972]
> > Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by mdadm [970]
> > Sep 30 03:40:07 localhost kernel: md_release(): md125 released by mdadm
> > [972]
> > Sep 30 03:40:07 localhost kernel: md_open(): md125 opened by
> > systemd-udevd [971]
> > Sep 30 03:40:07 localhost systemd[1]: Cannot add dependency job for unit
> > mdmonitor-takeover.service, ignoring: Invalid argument
> > Sep 30 03:40:07 localhost systemd[1]: Started Software RAID monitoring
> > and management.
> > Sep 30 03:40:07 localhost kernel: md_release(): md125 released by
> > systemd-udevd [971]
> > Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> > on md device /dev/md125
> > Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> > on md device /dev/md126
> > Sep 30 03:40:08 localhost mdadm[466]: DeviceDisappeared event detected
> > on md device /dev/md127
> > Sep 30 03:40:08 localhost kernel: md125: detected capacity change from
> > 1863254016 to 0
> > Sep 30 03:40:08 localhost kernel: md: md125 stopped.
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdc3>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc3)
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdb3>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb3)
> > Sep 30 03:40:08 localhost kernel: md_release(): md125 released by mdadm
> > [970]
> > Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
> > Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> > [466]
> > Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [466]
> > Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> > [466]
> > Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
> > Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> > [970]
> > Sep 30 03:40:08 localhost kernel: md_open(): md126 opened by mdadm [970]
> > Sep 30 03:40:08 localhost kernel: md126: detected capacity change from
> > 67043328 to 0
> > Sep 30 03:40:08 localhost kernel: md: md126 stopped.
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdc1>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc1)
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdb1>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb1)
> > Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [466]
> > Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> > [466]
> > Sep 30 03:40:08 localhost kernel: md_release(): md126 released by mdadm
> > [970]
> > Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
> > Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> > [970]
> > Sep 30 03:40:08 localhost kernel: md_open(): md127 opened by mdadm [970]
> > Sep 30 03:40:08 localhost kernel: md127: detected capacity change from
> > 214564864 to 0
> > Sep 30 03:40:08 localhost kernel: md: md127 stopped.
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdc2>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdc2)
> > Sep 30 03:40:08 localhost kernel: md: unbind<vdb2>
> > Sep 30 03:40:08 localhost kernel: md: export_rdev(vdb2)
> > Sep 30 03:40:08 localhost kernel: md_release(): md127 released by mdadm
> > [970]
> > 
> > The ghost device is no longer present, so your patch seems to have fixed
> > my issue. But I must admit I don't really understand what's going on :-/
> > 
> 
> Since those 'ghost' devices are expected from the MD implementation's
> point of view, I'm wondering how I am supposed to detect them, or
> how an application is supposed to recognize online arrays.

If your application is looking in /proc/mdstat, then the "ghost" devices will
be either "inactive" or not present at all.
If your application is looking in /sys/block/md*, then the "ghost" devices
will have "clear" or "inactive" in /sys/block/mdXX/md/array_state.
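
As a rough sketch of the sysfs check (illustrative only, not taken from mdadm;
the function names and the sysfs_root parameter are assumptions made so the
code can be pointed at a test directory), an application could filter out the
"ghost" devices like this:

```python
import glob
import os

# States that mark a "ghost" array, per the description above.
GHOST_STATES = {"clear", "inactive"}


def array_state(name, sysfs_root="/sys"):
    """Return the contents of md/array_state for `name`, or None if unreadable."""
    path = os.path.join(sysfs_root, "block", name, "md", "array_state")
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        # The device (or its md/ directory) does not exist right now.
        return None


def online_arrays(sysfs_root="/sys"):
    """List md devices whose array_state is neither "clear" nor "inactive"."""
    result = []
    for entry in sorted(glob.glob(os.path.join(sysfs_root, "block", "md*"))):
        name = os.path.basename(entry)
        state = array_state(name, sysfs_root)
        if state is not None and state not in GHOST_STATES:
            result.append(name)
    return result
```

Calling online_arrays() with no argument reads the real /sys/block/md*
entries. The answer is only a snapshot: an array can be stopped right after
the file is read.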

If you use the new "CREATE names=yes" line in mdadm.conf (mdadm 3.3 or
later), and use kernel 3.17 or later, and use names rather than numbers to
identify your arrays (/dev/md/home, /dev/md_root), then the "ghost" problem
will be gone, and names in /proc/mdstat will be e.g. "md_home", or "md_root"
rather than "md4" or "md127".

> 
> My application uses udev to detect new devices and to get information
> about them. I don't think the information exported by udev is enough to
> figure this out. Also please note that since I rely on udev, I can't
> really read information from /sys, since this information may be out of
> sync with what udev returns.

If udev reports that an array exists, then it really did exist when udev got
the message.  By the time your program gets run by udev, it might not exist
any more.
i.e. udev is always racy.
You should always treat any event from udev as a hint: 

  "Something happened to this device in the recent past.  Lots of other
  things might have happened since.  The device might not exist any more, or
  it might have been replaced with a completely different device.  So you
  might want to do something, or you might not, but whatever you do - be
  careful and don't blame me if things go wrong 'cause I'm just the
  messenger."
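
One way to treat an event as a hint, sketched under assumptions: the event is
represented here as a plain dict carrying udev's DEVNAME property, and the
function name and sysfs_root parameter are made up for illustration. The
handler ignores what the event claims and re-reads the current state before
doing anything:

```python
import os


def should_act_on(event, sysfs_root="/sys"):
    """Decide whether an md array named by a udev event is usable right now.

    `event` is a hypothetical dict of udev properties, e.g.
    {"ACTION": "add", "DEVNAME": "/dev/md125"}. The event is only a hint,
    so the decision is based on the state found in sysfs at call time.
    """
    name = os.path.basename(event.get("DEVNAME", ""))
    if not name:
        return False
    path = os.path.join(sysfs_root, "block", name, "md", "array_state")
    try:
        with open(path) as f:
            state = f.read().strip()
    except OSError:
        # The device has gone away since the event was generated.
        return False
    return state not in ("clear", "inactive")
```

Even this check is racy: the array can change again between the read and
whatever the caller does next, so later operations still need to cope with
failure.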

NeilBrown


