Re: Long wait for start job

On Sun, Jun 13, 2021 at 3:56 AM Patrick O'Callaghan
<pocallaghan@xxxxxxxxx> wrote:
>
> On Sun, 2021-06-13 at 07:09 +0800, Ed Greshko wrote:
> > On 13/06/2021 06:57, Ed Greshko wrote:
> > > But, does your plot show a difference?
> >
> > Speaking of your plot.....
> >
> > Don't you think the time between
> >
> > sys-devices-pci0000:00-0000:00:1a.0-usb1-1\x2d1-1\x2d1.6-
> > 1\x2d1.6.2.device and
> > dev-disk-by\x2dpath-
> > pci\x2d0000:00:14.0\x2dusb\x2d0:3:1.0\x2dscsi\x2d0:0:0:1.device
> >
> > worth looking into?
>
> Of course. That's precisely the issue I'm concerned about. I don't see
> what's causing it. My working hypothesis is that it's somehow related
> to the fact that the external dock supports two drives in a BTRFS RAID1
> configuration and that the kernel is verifying them when it starts up,
> even though the drives are not being mounted (they have an automount
> unit but nothing in /etc/fstab).
>
> Why it would delay the rest of the system startup while this is
> happening is something I don't understand. The delay is very visible (I
> get three dots on a blank screen while it's happening).

Short version:
Is this Btrfs raid1 listed at all in fstab? If so, add noauto,nofail
to the mount options and see if that clears it up.
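
To make that check concrete, here's a rough sketch, not something
I've run on your system. The function name is made up for
illustration, and it assumes the usual whitespace-separated fstab
layout (device, mount point, type, options). It prints the mount
point of every Btrfs entry that lacks noauto or nofail:

```shell
# Hypothetical helper: list Btrfs entries in an fstab-format file
# whose options do not include both noauto and nofail.
# Defaults to /etc/fstab when no file is given.
btrfs_missing_opts() {
    awk '$1 !~ /^#/ && $3 == "btrfs" {
        n = split($4, opts, ",")
        noauto = 0
        nofail = 0
        for (i = 1; i <= n; i++) {
            if (opts[i] == "noauto") noauto = 1
            if (opts[i] == "nofail") nofail = 1
        }
        # Print the mount point of any entry missing either option
        if (!(noauto && nofail)) print $2
    }' "${1:-/etc/fstab}"
}
```

Run it as "btrfs_missing_opts" with no argument to read /etc/fstab.
No output means either nothing Btrfs is listed or every Btrfs entry
already has both options.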

Long version:
Dracut handles mdadm array assembly. Normal (non-degraded) assembly is
done by dracut using the mdadm command; if that fails, dracut starts a
countdown loop, I think 300 seconds, before it tries a degraded
assembly. None of this exists in dracut for Btrfs raid. For one, Btrfs
raid assembly is combined with mount: pointing the mount command at
any one of the member devices results in the kernel finding all the
member devices automagically. If one or more members are missing,
mount fails. Since systemd only tries to mount one time, and because
it's fairly likely that mounting a multi-device Btrfs as /sysroot will
fail when one or more devices are not yet ready, there is a udev rule
to wait for all of them to become ready:

/usr/lib/udev/rules.d/64-btrfs.rules

The gotcha is that this simple rule waits indefinitely. The rule is
there to make sure a normal (non-degraded) boot doesn't incorrectly
fail just because one of the devices shows up 1s late. But if a drive
has actually failed, the result is a hang. Forever. You can add the
"x.systemd.timeout=300" boot parameter to approximate the rather
long dracut wait for mdadm. And at a dracut shell, you can then just:

mount -o degraded /dev/sdXY /sysroot
exit

And away you go. Of course this is non-obvious. And it needs to work
better. And it will, eventually.

So the next gotcha is if /sysroot is not Btrfs. In this case there's a
bug in dracut that prevents this udev rule from being put into the
initramfs. That means anything that tries to mount a non-root Btrfs
during boot, whether from fstab or GPT discoverable partitions, might
fail if "not all devices are ready" at the time of the mount attempt.

https://github.com/dracutdevs/dracut/issues/947

This should be fixed in dracut 055. But if you already have 055, your
initramfs was built with it, and this problem is new, maybe we've got
a regression in 055 or something? I'm not sure yet... still kinda in
the dark on what's going wrong.
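
One way to see which case you're in is to look for the rule inside
the initramfs. A minimal sketch, assuming lsinitrd (shipped with
dracut) is available and that the rule keeps the file name
64-btrfs.rules; the function name is just for illustration:

```shell
# Hypothetical helper: succeed if an initramfs file listing read from
# stdin mentions the Btrfs udev rule, fail otherwise.
initramfs_has_btrfs_rule() {
    grep -q '64-btrfs\.rules'
}
```

Usage, as root so the image is readable:

rpm -q dracut
lsinitrd | initramfs_has_btrfs_rule && echo present || echo missing

With no arguments lsinitrd lists the initramfs for the running
kernel. If the rule is missing on 055, rebuilding with "dracut -f"
and checking again would tell us whether the fix is actually taking
effect.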

Also, it is possible it's not related to this btrfs file system at
all, but I'm throwing it out there just as something to be aware of.


-- 
Chris Murphy