Re: F29 System Wide Change: Make BootLoaderSpec the default

On Mon, Jun 25, 2018 at 4:40 AM, Lennart Poettering
<mzerqung@xxxxxxxxxxx> wrote:

> I am not sure why you are making this about systemd-boot. Let's just
> focus on why (or why not) vfat is the best option for $BOOT.

I'm making it about systemd-boot because it is literally the only
bootloader that can't read any file system that the firmware can't
already read; it's a UEFI only bootloader and it definitely sounds
like the spec is written with the bootloader in mind rather than the
other way around.

VFAT: No links, labels, or xattr. Like I said earlier, UDF does
support them, and has almost as much crossplatform kernel and
bootloader support as FAT, and is as simple a volume format as FAT. If
there is a way to get atomic updates on vfat without symlink, maybe
vfat could be used for $BOOT. See below.

>> > Which file system do you have in mind even for this?
>>
>> Unspecified for now. i.e. no change. It would remain ext4 by default I
>> expect, but ultimately whatever anaconda allows.
>
> So think about this one bit ahead. Right now it's clear that even with
> Grub's relatively large contributor base it's hard to impossible to
> support modern Linux file systems properly, even just for
> read-only. See the XFS debacle as one example, and that the kernel
> folks made clear they only consider their own in-kernel implementation
> to be supportable. Now, I'd assume that sooner or later features such
> as boot counting are something we want for Fedora too (i.e. that we
> can update the kernel, try to boot the new one a couple of times, and
> when it fails all the time revert back to an older version, fully
> automatically; in fact the fedora desktop have very recently started
> work on that, though they have a weaker model of simply showing the
> boot menu after failed boots, instead of reverting back). Now, in that
> model you need to count the attempted boots somewhere. Thus you need
> write access somewhere (and no, EFI variables don't work for this,
> they are not suitable for stuff changed on every boot, as their memory
> is generally not ready to be written to that often). Which hence
> means you need write access to some file system in some form from the
> boot loader. And how do you think that's going to work out if already
> read access to modern file systems is, well, a disaster?


I agree the counter use case has merit. Is it a bootloader specific
feature, or is it supposed to work across bootloaders?

GRUB folks absolutely proscribe any file system writes. There is one
state file, the grubenv. It is created through the file system by
'grub-editenv create', and GRUB reads the file system only to determine
the LBAs for that file. Any writes by GRUB to grubenv happen outside
the file system driver, directly to an LBA, so they're atomic (or at
least as atomic as a drive can support). This matters because creating
a new file through the file system requires multiple writes, with no
guarantee that all of them will actually happen, which could leave the
file system dirty and needing repair, in addition to failing to meet
the requirement for your use case, which is a reliable way of counting
boots *in the face of unknown boot failure*.
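
To make the grubenv point concrete, here is a rough Python model of a
fixed-size environment block. The 1024 byte size, the signature line
and the '#' padding are how I understand GRUB's on-disk format; the
function names are mine and this is not GRUB's code. The thing to
notice is that changing a variable never changes the size of the
block, so an update is an overwrite of sectors that are already
allocated, with no file system metadata involved.

```python
ENVBLK_SIZE = 1024                              # assumed fixed block size
ENVBLK_SIGNATURE = b"# GRUB Environment Block\n"


def make_envblk(variables):
    """Serialize variables into a fixed-size block padded with '#'."""
    body = b"".join(f"{k}={v}\n".encode("ascii") for k, v in variables.items())
    block = ENVBLK_SIGNATURE + body
    if len(block) > ENVBLK_SIZE:
        raise ValueError("variables do not fit in the environment block")
    return block + b"#" * (ENVBLK_SIZE - len(block))


def parse_envblk(block):
    """Return the key/value pairs stored in a block."""
    if len(block) != ENVBLK_SIZE or not block.startswith(ENVBLK_SIGNATURE):
        raise ValueError("not an environment block")
    variables = {}
    for line in block[len(ENVBLK_SIGNATURE):].split(b"\n"):
        if not line or line.startswith(b"#"):   # padding or comment line
            continue
        key, _, value = line.partition(b"=")
        variables[key.decode("ascii")] = value.decode("ascii")
    return variables


def set_variable(block, key, value):
    """Return a new block of identical size with one variable changed."""
    variables = parse_envblk(block)
    variables[key] = str(value)
    return make_envblk(variables)
```

Because set_variable() always returns exactly ENVBLK_SIZE bytes, the
result can be written back over the same LBAs the old block occupied.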

So are you proposing a BLS variant of the grubenv that all bootloaders
can share? It does matter, because not all file systems support
grubenv.

And it also matters because this same state file could be used for
atomically switching the default boot entry. e.g. rpm-ostree depends
on a symlink to switch default boot, so perhaps this could be done by
modifying a flag in this static file, outside of the file system.
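
For comparison, the symlink-based switch looks roughly like this on a
POSIX file system. It is a generic sketch of the rename-over-symlink
technique, not rpm-ostree's actual code; atomic_symlink() and the
temporary name are made up for illustration. It relies on rename()
atomically replacing the old link, which is exactly what vfat can't
offer since it has no symlinks.

```python
import os


def atomic_symlink(target, link_path):
    """Point link_path at target so readers see either the old or new link."""
    directory = os.path.dirname(os.path.abspath(link_path))
    tmp_path = f"{link_path}.tmp.{os.getpid()}"   # hypothetical temp name
    os.symlink(target, tmp_path)
    try:
        # rename() replaces the existing link itself in one step; it does
        # not follow the link.
        os.replace(tmp_path, link_path)
    except OSError:
        os.unlink(tmp_path)
        raise
    # Flush the directory entry so the switch survives a crash.
    dir_fd = os.open(directory, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
```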

> Again, FAT is the one thing everyone can agree on. Boot loaders can
> read it *and write it*, UEFI and raspberry pi firmwares have support
> for it, and all OSes and their initrds generally too.

Bootloaders absolutely do not write to any file system including FAT.
And GRUB's grubenv permits storing limited state information on file
systems other than vfat, so vfat still isn't required.

Let me tell you how totally non-trivial VFAT is for sharing when the
driver is in firmware. Digital camera vendors have had vfat drivers in
both consumer and professional cameras for over a decade. The one sure
way you can corrupt your CF/SD card file system is transferring it
between cameras *even of the same make and model with different
firmware versions* and doing basic file operations like create and
delete. Boom! Fuck all your files! Hahaha! (Yes the camera maniacally
laughs in your face as it corrupts the file system.) The manufacturer
recommendation, even on professional gear? Format the card in-camera
before each use. Shoot. Do not ever delete files. When you're ready,
suck the images off the card, back them up, put the card back in the
camera, reformat. If you switch cards to different cameras, reformat
the card. You can't do that? Expect that data loss is possible.



> From the Linux side we can provide relatively safe read and write
> support for FAT. For example, if Fedora would use the systemd
> automount logic for mounting $BOOT then the file system will generally
> be unmounted, except in a small time window around actual
> accesses. This means the chance that the file system remains in a
> clean state is maximized.
>
> $BOOT is a place to place very few files, with very simple access
> patterns. Basically, during update cycles we just add a few files
> there and remove some others, and they are written in one linear write
> operation. For doing that we need no fancy file system features. The
> simplest, most common file system storing files is good enough for
> that.


I've lost count of how many times I personally have experienced such data
loss, with all sorts of consumer and professional gear, let alone the
number of stories I've heard from professional photographers and from
camera and SD/CF card engineers.

There is no possible way you can convince me it's reliable for either
firmware or bootloader to do even simple file operations on vfat. I've
had way too much experience to the contrary. It's such a basic and
expected thing, that it's a fundamental process taught to
photographers: you can find it in any technical book and class on the
subject. Sure it's fucked up. It's also true.

Writing some limited hints that fit in a single 512 byte sector is
perhaps plausible, provided we're writing them directly to a single
LBA outside of the file system driver.
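
A sketch of what such a single-sector hint could look like, in Python.
The layout (magic, boot counter, default entry index, CRC32 in the last
four bytes) is entirely hypothetical; nothing like it is defined in the
BLS today. The checksum lets a reader tell a valid sector from a torn
or garbage one.

```python
import struct
import zlib

SECTOR_SIZE = 512
MAGIC = b"BOOTHINT"                    # hypothetical 8-byte signature
_HEADER = struct.Struct("<8sII")       # magic, boot_count, default_entry
_CRC = struct.Struct("<I")             # CRC32 of everything before it


def pack_sector(boot_count, default_entry):
    """Build one 512-byte sector holding the hints plus a checksum."""
    header = _HEADER.pack(MAGIC, boot_count, default_entry)
    padding = bytes(SECTOR_SIZE - _HEADER.size - _CRC.size)
    payload = header + padding
    return payload + _CRC.pack(zlib.crc32(payload))


def unpack_sector(sector):
    """Return (boot_count, default_entry), or None if the sector is invalid."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("expected exactly one sector")
    payload = sector[:-_CRC.size]
    (crc,) = _CRC.unpack(sector[-_CRC.size:])
    if zlib.crc32(payload) != crc:
        return None                    # torn or corrupted write
    magic, boot_count, default_entry = _HEADER.unpack_from(payload)
    if magic != MAGIC:
        return None
    return boot_count, default_entry
```

The bootloader would read the sector, bump boot_count, and write the
whole 512 bytes back to the same LBA in one go.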


>> This problem has many little saboteurs acting together to betray the
>> user. It isn't really any one single thing, they all have to happen to
>> capsize the ship.
>
> So what are you proposing? Are you going to work on the XFS driver in
> grub to make it match the kernel's current version? And for ext4 too?
> I mean, good luck with that...

For that specific problem, I've already said plymouth is doing the
wrong damned thing totally contrary to rather clear systemd
documentation. And that's why systemd fails to remount-ro which is why
the journal isn't flushed which is why the bootloader doesn't see the
changes (or alternatively sometimes it sees part of the changes in the
form of a zero length grub.cfg).

*shrug* Look if people do whatever they want contrary to
documentation, and then take years to still not fix their shit? That is
not my idea. My proposal has been to revert the commit allowing
plymouth to be kill exempt by systemd. But it's been a year, over a
year maybe, since then and it's still not done? So now my proposal is
something like this: systemd folks, too many people are pissing in the
swimming pool, you can no longer provide any service a means of being
kill exempt. Delay the kill if you want, but eventually it must be
killed, so you can remount-ro, so that the journal is flushed, so that
the bootloader doesn't fucking faceplant.

Really though, the bootloaders each need to do this stupid
FIFREEZE/FITHAW if they're going to support file systems that refuse
to sync() correctly. And that's because ultimately it is not systemd's
responsibility to remount-ro in order to make sure the bootloader's
changes are properly committed to disk. The GRUB folks want to support
XFS? Fine. That means either supporting FIFREEZE/FITHAW at grub.cfg
create time, or they have to teach the bootloader how to read a
journal. The former is a metric butt-ton easier than the latter, even
if it's not an atomic operation.
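
Roughly what the freeze/thaw dance looks like from user space, sketched
in Python. The ioctl numbers are computed the way linux/fs.h defines
them (_IOWR('X', 119, int) and _IOWR('X', 120, int)) using the generic
ioctl encoding found on x86 and ARM; some architectures encode the
direction bits differently. write_and_flush() is a made-up name, and
the ioctls need CAP_SYS_ADMIN.

```python
import fcntl
import os


def _iowr(type_char, nr, size):
    """Generic Linux _IOWR() encoding (assumes the x86/ARM bit layout)."""
    return (3 << 30) | (size << 16) | (ord(type_char) << 8) | nr


FIFREEZE = _iowr("X", 119, 4)
FITHAW = _iowr("X", 120, 4)


def write_and_flush(path, data, mountpoint):
    """Write a file, then freeze and thaw its file system.

    Freezing forces the journal to be flushed, so a bootloader that
    cannot replay a journal still sees the complete file.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()
        os.fsync(f.fileno())
    fd = os.open(mountpoint, os.O_RDONLY)
    try:
        fcntl.ioctl(fd, FIFREEZE, 0)
        fcntl.ioctl(fd, FITHAW, 0)
    finally:
        os.close(fd)
```

Something like write_and_flush("/boot/grub2/grub.cfg", cfg, "/boot")
would be the idea, assuming /boot is its own file system.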




>
>> > Why not just stick to VFAT? As mentioned, it's really the only thing
>> > generally understood by everything that has a stake in boot
>> > loading. Grub speaks it. The EFI firmware speaks it (and that also
>> > means the EFI shell, which is immensely useful). Linux speaks it in the
>> > initrd and after boot. Windows speaks it. MacOS speaks it. It's the
>> > lowest common denominator and should be entirely sufficient to store a
>> > few kernels and their initrds. I mean, we build our kernels as EFI
>> > binaries on Fedora, IIRC. Wouldn't it be a pity if EFI can't actually
>> > access them, because they are stored on an fs only Linux speaks?
>>
>> Wouldn't it be a pity if we didn't teach UEFI to read every goddamn
>> file system ever invented just because we can?!
>>
>> http://efi.akeo.ie/downloads/efifs-latest/x64/
>> https://github.com/pbatard/efifs
>
> Oh, right. This approach already failed with Grub, with its
> relatively large commercial support, and now you want to pile on?

Oh please stop. It's not a failed approach. Windows and macOS have
been doing this reliably for 20 years because, holy crap! they
actually commit their bootloader changes before rebooting!

Now if you want to point fingers at why the bootloader changes aren't
committed properly before reboot, have at it, I'm totally on board.
But you don't get to point to one tiny fail and say the whole concept
is a failure, when there are billions of implementations that do the
same damn thing correctly on NTFS and JHFS+, and neither of them is
doing journal replays in its bootloader either. They expect things
to be properly committed before reboot is even called. And they can do
this because they aren't doing weird shit like persistently mounting
their ESPs or boot volumes and expecting it's someone else's problem
to cleanly unmount them.


>
>> I mean honestly, we can teach EFI whatever the hell we want. File
>> system support does not need to be baked into the bootloader on UEFI.
>> Drop these guys onto your ESP and now the firmware with any bootloader
>> can read any of those file systems directly. Pick one.
>>
>> I have to defer to others on the value of symlinks for atomic
>> configuration swapout, but if you want the most widely supported file
>> system that also has symlink support, it's UDF. For the time being
>> though, the concept of a widely sharable $BOOT really doesn't have
>> enough momentum or backing.
>
> UDF? When's the last time you actually used that? I mean, I don't even
> have a DVD drive anymore, where I could find an UDF file system on...

I use it regularly on large flash media because it has no file size
limit like vfat's, it supports Unix permissions, symlinks, hardlinks,
and also isn't proprietary like vfat or exfat. Since ancient times
it's been intended for random read/write support on flash media and
even hard drives, it's not a DVD only thing. It has way better cross
platform support than exFAT, NTFS, or HFS+ - and without the vfat
limitations.

> Also, it's read-only afair, hence stuff like boot counting is not
> going to work, it's a dead end.

It is not a read only affair. You're confusing UDF with ISO 9660. They
aren't the same thing. UDF has offered random read write support for
hard drives before flash drives were even a thing, since its
inception.

https://en.wikipedia.org/wiki/Universal_Disk_Format

-- 
Chris Murphy
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/message/2DRUKAMERLSLYNUTJPZ4WTL2BLOTTZUO/