Re: [systemd-devel] Erroneous detection of degraded array

On Mon, Jan 30 2017, Andrei Borzenkov wrote:

> 30.01.2017 04:53, NeilBrown wrote:
>> On Fri, Jan 27 2017, Andrei Borzenkov wrote:
>> 
>>> 26.01.2017 21:02, Luke Pyzowski wrote:
>>>> Hello,
>>>> I have a large RAID6 device with 24 local drives on CentOS 7.3. Randomly (around 50% of the time) systemd will unmount my RAID device, thinking it is degraded, after the mdadm-last-resort@.timer expires; however, the device is working normally by all accounts, and I can immediately mount it manually upon boot completion. In the logs below /share is the RAID device. I can increase the timer in /usr/lib/systemd/system/mdadm-last-resort@.timer from 30 to 60 seconds, but the problem can still occur at random.
>>>>
>>>> systemd[1]: Created slice system-mdadm\x2dlast\x2dresort.slice.
>>>> systemd[1]: Starting system-mdadm\x2dlast\x2dresort.slice.
>>>> systemd[1]: Starting Activate md array even though degraded...
>>>> systemd[1]: Stopped target Local File Systems.
>>>> systemd[1]: Stopping Local File Systems.
>>>> systemd[1]: Unmounting /share...
>>>> systemd[1]: Stopped (with error) /dev/md0.
>> 
>> This line perplexes me.
>> 
>> The last-resort.service (and .timer) files have a Conflicts= directive
>> against sys-devices-virtual-block-md$DEV.device.
>> Normally a Conflicts= directive means that if this service starts, that
>> one is stopped, and if that one starts, this is stopped.
>> However .device units cannot be stopped:
>> 
>> $ systemctl show sys-devices-virtual-block-md0.device | grep Can
>> CanStart=no
>> CanStop=no
>> CanReload=no
>> CanIsolate=no
>> 
>> so presumably the attempt to stop the device fails, so the Conflicts=
>> dependency cannot be met, and so the last-resort service (or timer)
>> doesn't get started.
>
> As I explained in the other mail, to me it looks like the last-resort
> timer does get started, and then the last-resort service is started,
> which attempts to stop the device; and because the mount point depends
> on the device, it also stops the mount point. So somehow we have bad
> timing where both the device and the timer start without canceling
> each other.
>
> The fact that stopping the device itself fails is irrelevant here -
> dependencies are evaluated at the time the job is submitted, so if
> share.mount Requires dev-md0.device and you attempt to stop
> dev-md0.device, systemd still queues a job to stop share.mount.
>
>> At least, that is what I see happening in my tests.
>> 
>
> Yes, we have a race condition here; I cannot reproduce this either. That
> does not mean it does not exist :) Let's hope debug logging will show
> something more useful (it is entirely possible that with debug logging
> turned on this race does not happen).
>
>> But your log doesn't mention sys-devices-virtual-block-md0; it
>> mentions /dev/md0.
>> How does systemd know about /dev/md0, or the connection it has with
>> sys-devices-virtual-block-md0?
>> 
>
> By virtue of the "Following" attribute: dev-md0.device is Following
> sys-devices-virtual-block-md0.device, so stopping the latter will also
> stop the former.

Ahh.. now I see why I never saw this.
Two reasons:
 1/ My /etc/fstab has UUID=d1711227-c9fa-4883-a904-7cd7a3eb865c rather
    than /dev/md0
    systemd doesn't manage to intuit a 'Following' dependency between
    the UUID and the mount point.
 2/ I use partitions of md arrays: that UUID is actually /dev/md0p3.
    systemd doesn't intuit that md0p3.device is Following md0.device.

So you only hit a problem if you have "/dev/md0" or similar in
/etc/fstab.
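
For illustration only (the filesystem type, options, and UUID below are
placeholders; /share and /dev/md0 are from Luke's report), the
difference is between an fstab entry like

  UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /share  xfs  defaults  0 0

for which systemd doesn't create a 'Following' link back to
sys-devices-virtual-block-md0.device, and one like

  /dev/md0  /share  xfs  defaults  0 0

where dev-md0.device is Following that device unit, so a stop job on
the device also pulls in a stop job for share.mount.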

The race is, I think, the one I mentioned.  If the md device is started
before udev tells systemd to start the timer, the Conflicts dependency
goes the "wrong" way and stops the wrong thing.

It would be nice to be able to reliably stop the timer when the device
starts, without risking having the device stopped when the timer starts,
but I don't think we can do that reliably.

Changing the
  Conflicts=sys-devices-virtual-block-%i.device
lines to
  ConditionPathExists=/sys/devices/virtual/block/%i
might make the problem go away, without any negative consequences.
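
To be concrete, a rough sketch of how mdadm-last-resort@.timer would
then read (the unchanged lines are from memory, so the exact wording
and the OnActiveSec value may differ from what your distro ships); the
same one-line substitution would apply to mdadm-last-resort@.service:

  [Unit]
  Description=Timer to wait for more drives before activating degraded array.
  DefaultDependencies=no
  ConditionPathExists=/sys/devices/virtual/block/%i

  [Timer]
  OnActiveSec=30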

The primary purpose of having the 'Conflicts' directives was so that
systemd wouldn't log
  Starting Activate md array even though degraded
after the array was successfully started.
Hopefully it won't do that when the Condition fails.
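
If anyone who can reproduce this wants to check the result, something
like the following (assuming the array is md0) should show whether the
last-resort units actually ran or were skipped because their start
condition failed:

  systemctl status mdadm-last-resort@md0.timer mdadm-last-resort@md0.service
  journalctl -b -u mdadm-last-resort@md0.service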

Thanks,
NeilBrown


>
>> Does
>>   systemctl list-dependencies sys-devices-virtual-block-md0.device
>> 
>> report anything interesting?  I get
>> 
>> sys-devices-virtual-block-md0.device
>> ● └─mdmonitor.service
>> 
