On Sun, Jun 13, 2021 at 3:56 AM Patrick O'Callaghan <pocallaghan@xxxxxxxxx> wrote:
>
> On Sun, 2021-06-13 at 07:09 +0800, Ed Greshko wrote:
> > On 13/06/2021 06:57, Ed Greshko wrote:
> > > But, does your plot show a difference?
> >
> > Speaking of your plot.....
> >
> > Don't you think the time between
> >
> > sys-devices-pci0000:00-0000:00:1a.0-usb1-1\x2d1-1\x2d1.6-1\x2d1.6.2.device and
> > dev-disk-by\x2dpath-pci\x2d0000:00:14.0\x2dusb\x2d0:3:1.0\x2dscsi\x2d0:0:0:1.device
> >
> > worth looking into?
>
> Of course. That's precisely the issue I'm concerned about. I don't see
> what's causing it. My working hypothesis is that it's somehow related
> to the fact that the external dock supports two drives in a BTRFS RAID1
> configuration and that the kernel is verifying them when it starts up,
> even though the drives are not being mounted (they have an automount
> unit but nothing in /etc/fstab).
>
> Why it would delay the rest of the system startup while this is
> happening is something I don't understand. The delay is very visible (I
> get three dots on a blank screen while it's happening).

Short version: Is this Btrfs raid1 listed at all in fstab? If so, add
noauto,nofail to the mount options and see if that clears it up.

Long version:

Dracut handles mdadm array assembly. Normal (non-degraded) assembly is
done by dracut using the mdadm command; but if that fails, dracut
starts a countdown loop, I think 300 seconds, before it tries a
degraded assembly.

None of this exists for btrfs raid in dracut. For one, btrfs raid
assembly is combined with mount. Pointing the mount command at any one
of the member devices results in the kernel finding all the other
member devices automagically. If one or more members are missing, the
mount fails.
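On the short version above: a minimal Python sketch for spotting Btrfs
fstab entries that lack noauto or nofail. It assumes the usual
whitespace-separated fstab layout (device, mount point, type, options,
...). The function name btrfs_entries_missing_opts is hypothetical, and
skipping "/" is my assumption, since the root file system should not be
noauto. Treat the output as a hint to review, not a verdict.

```python
def btrfs_entries_missing_opts(fstab_text, required=("noauto", "nofail")):
    """Return (mount_point, missing_options) for each non-root Btrfs entry.

    Hypothetical helper: only inspects the text it is given and never
    modifies /etc/fstab.
    """
    findings = []
    for raw in fstab_text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments.
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        # A usable entry needs at least: device, mount point, type, options.
        if len(fields) < 4:
            continue
        _spec, mount_point, fstype, options = fields[:4]
        # Assumption: the root file system is left alone.
        if fstype != "btrfs" or mount_point == "/":
            continue
        present = set(options.split(","))
        missing = [opt for opt in required if opt not in present]
        if missing:
            findings.append((mount_point, missing))
    return findings
```

To try it on a real system, pass it the contents of /etc/fstab, e.g.
btrfs_entries_missing_opts(open("/etc/fstab").read()).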
Since systemd only tries to mount one time, and because it's decently
likely that mounting a multiple-device btrfs as /sysroot will fail as a
result of one or more devices not yet being ready, there is a udev rule
to wait for everyone to get ready:

/usr/lib/udev/rules.d/64-btrfs.rules

The gotcha is that this simple rule waits indefinitely. The rule is
there to make sure a normal (non-degraded) boot doesn't incorrectly
fail just because one of the devices shows up 1s late. But if a drive
has actually failed, it results in a hang. Forever.

You can add the "x.systemd.timeout=300" boot parameter to approximate
the rather long dracut wait for mdadm. And at a dracut shell, you can
then just:

mount -o degraded /dev/sdXY /sysroot
exit

And away you go. Of course this is non-obvious. And it needs to work
better. And it will, eventually.

So the next gotcha is if /sysroot is not Btrfs. In this case there's a
bug in dracut that prevents this udev rule from being put into the
initramfs. That means anything that tries to mount a non-root Btrfs
during boot, whether from fstab or GPT discoverable partitions, might
fail if "not all devices are ready" at the time of the mount attempt.

https://github.com/dracutdevs/dracut/issues/947

This should be fixed in dracut 055. But if you already have 055, your
initramfs was built with it, and this problem is new, maybe we've got a
regression in 055 or something? I'm not sure yet...still kinda in the
dark on what's going wrong.

Also, it is possible it's not related to this btrfs file system at all,
but I'm throwing it out there just as something to be aware of.
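For the dracut bug above, one way to see whether a given initramfs
actually contains the udev rule is to look for it in a file listing of
the image, such as the one dracut's lsinitrd prints. The Python sketch
below assumes ls -l style output with the path as the last field. The
function names are hypothetical, and running lsinitrd with no arguments
against the default image (which usually needs root) is an assumption
about your setup.

```python
import subprocess

RULE_NAME = "64-btrfs.rules"


def listing_has_btrfs_rule(listing_text, rule_name=RULE_NAME):
    """Return True if any path in an initramfs file listing ends with rule_name."""
    for line in listing_text.splitlines():
        parts = line.split()
        # The path is assumed to be the last whitespace-separated field.
        if parts and parts[-1].endswith("/" + rule_name):
            return True
    return False


def current_initramfs_has_btrfs_rule():
    """Run lsinitrd on the default image and check its listing.

    Assumption: lsinitrd is installed and the caller may read the image.
    """
    result = subprocess.run(
        ["lsinitrd"], capture_output=True, text=True, check=True
    )
    return listing_has_btrfs_rule(result.stdout)
```

If the rule is missing from an image built with dracut 055 or later,
that would be worth mentioning in the issue linked above.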
--
Chris Murphy