Re: F29 System Wide Change: Make BootLoaderSpec the default

On Tue, 26.06.18 12:51, Chris Murphy (lists@xxxxxxxxxxxxxxxxx) wrote:

> I'm making it about systemd-boot because it is literally the only
> bootloader that can't read any file system that the firmware can't
> already read; it's a UEFI only bootloader and it definitely sounds
> like the spec is written with the bootloader in mind rather than the
> other way around.

Let me stress that I am bringing forward technical reasons to use FAT
that have nothing to do with sd-boot. It's you who wants to make it
about that; I haven't mentioned it before.

> > boot menu after failed boots, instead of reverting back). Now, in that
> > model you need to count the attempted boots somewhere. Thus you need
> > write access somewhere (and no, EFI variables don't work for this,
> > they are not suitable for stuff changed on every boot, as their memory
> > is generally not ready to be written to that often). Which hence
> > means you need write access to some file system in some form from the
> > boot loader. And how do you think that's going to work out if already
> > read access to modern file systems is, well, a disaster?
> 
> I agree the counter use case has merit. Is it a bootloader specific
> feature, or is it supposed to work across bootloaders?

Well, I am actually working on adding boot counting support to
systemd, on all layers, i.e. in systemd-boot to count down, and in
systemd itself to assess whether a boot should be considered "good"
and to tell the boot loader about it for the next cycle. All of this
stuff is fairly generic, i.e. communication between systemd and the
boot loader, and it is intended to be, so that other boot loaders can
reuse as much or as little of this logic for whatever they want to
do. But of course, I myself will only put this together for
systemd-boot, it's up to others to hook things up with grub if there's
interest.

The logic I am working on is built as an add-on to the boot loader
spec: I simply encode the counter in the file name of the snippet, so
that counting down and marking an entry as good is a simple rename
operation, which has a good chance to be implementable atomically even
in simpler file systems (such as fat), as there should not be the need
to allocate or release blocks. It also has the benefit that clean-up
cycles are very clear as no additional data is kept and removing a
snippet implicitly also drops its counting information. Example: if a
boot spec entry would normally be called "foobar.conf" without boot
counting in use, then with boot counting it would be called
"foobar+5.conf" or so (in case it is configured to be tried 5 times
before giving up). It would then be renamed to foobar+4.conf on the
first try, then foobar+3.conf, foobar+2.conf, foobar+1.conf and
finally, if all attempts failed, foobar+0.conf. And a tiny new systemd
component will rename the current entry to foobar.conf (i.e. dropping
the counter) on success.
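
A minimal sketch of the renaming scheme described above, in Python for
illustration only. It is not the systemd or sd-boot implementation; the
function names and the exact "<name>+<tries>.conf" pattern are
assumptions taken from the foobar example in this mail.

```python
import os
import re

# Assumed pattern, following the example in the mail: "<name>+<tries>.conf".
_COUNTED = re.compile(r"^(?P<stem>.+)\+(?P<tries>\d+)\.conf$")


def next_attempt_name(filename):
    """Name the entry gets after one more boot attempt.

    Returns None if the name carries no counter, or if the counter is
    already 0 (all attempts used up).
    """
    m = _COUNTED.match(filename)
    if m is None:
        return None
    tries = int(m.group("tries"))
    if tries == 0:
        return None
    return f"{m.group('stem')}+{tries - 1}.conf"


def good_name(filename):
    """Name the entry gets once a boot succeeded: the counter is dropped."""
    m = _COUNTED.match(filename)
    if m is None:
        return filename  # already without a counter
    return f"{m.group('stem')}.conf"


def rename_entry(directory, old, new):
    """Apply the state change as a single rename within one directory."""
    os.rename(os.path.join(directory, old), os.path.join(directory, new))
```

Both state changes (counting down and marking an entry good) reduce to
one call to rename_entry(), which is the property the scheme relies on:
no separate counter file has to be written or cleaned up.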

> So are you proposing a BLS variant of the grubenv that all bootloaders
> can share? It does matter, because not all file systems support
> grubenv.

Well, I am proposing a solution based on rename()s above. But I am
fully aware that this might not be an option that convinces everybody,
which is why the stuff is generic on all levels: if you don't want to
use renames, then build your own logic instead, but you can still make
use of systemd's boot completion check logic in that case.

> Let me tell you how totally non-trivial VFAT is for sharing when the
> driver is in firmware. Digital camera vendors have had vfat drivers in
> both consumer and professional cameras for over a decade. The one sure
> way you can corrupt your CF/SD card file system, is transferring it
> between cameras *even of the same make and model with different
> firmware versions* and doing basic file operations like create and
> delete. Boom! Fuck all your files! Hahaha! (Yes the camera maniacally
> laughs in your face as it corrupts the file system.) The manufacturer
> recommendation, even on professional gear? Format the card in-camera
> before each use. Shoot. Do not ever delete files. When you're ready
> suck the images off the card, back them up, put the card back in the
> camera, reformat. If you switch cards to different cameras, reformat
> the card. You can't do that? Expect data loss is possible.

Dunno. We are not talking about digital cameras here. If only for
licensing/patent reasons, firmware tends to stick to the Intel UEFI
FAT driver, which might be bad, and vendors might have patched around
in it and made it worse, but I think it's certainly not as bad as you
assume it to be.

> I've lost count how many times I personally have experienced such data
> loss, with all sorts of consumer and professional gear, let alone the
> number of stories I've heard from professional photographers and from
> camera and SD/CF card engineers.

I did not assume the goal was to run fedora on a digital camera. This
is borderline FUD...

Also, your comparison is very flawed, as firmware would either
exclusively do *read* access and no write access at all, or it would be
doing only very minimal write access. I mean, even though I think
boot counting as a general feature is highly relevant, and hence
write access is necessary, I also believe write access should not be
done willy-nilly, but reduced to the minimum and done in operations
that have the greatest chance of being safe. That's why we decided to
do boot counting with simple renames in sd-boot, after all.

But given that the exclusive (or almost exclusive) write side of
things is done from Linux (and not from the boot loader), it's under
our control, and hence can be made as safe as we can make it. Or to
say this differently: it's the crappy firmware fs code that primarily
*reads*, and our Linux code that *writes*. That's systematically
different from the camera world, where it's the crappy device code
that primarily *writes* and we on our Linux PCs mostly only *read* the
CF cards...

> *shrug* Look if people do whatever they want contrary to
> documentation, and then take years to still not fix their shit? That is
> not my idea. My proposal has been to revert the commit allowing
> plymouth to be kill exempt by systemd. But it's been a year, over a
> year maybe, since then and it's still not done? So now my proposal is
> something like this: systemd folks, too many people are pissing in the
> swimming pool, you can no longer provide any service a means of being
> kill exempt. Delay the kill if you want, but eventually it must be
> killed, so you can remount-ro, so that the journal is flushed, so that
> the bootloader doesn't fucking faceplant.

Well, the original reason why the concept exists is storage, so that
we leave storage daemons running that are needed for PID1's own
files, i.e. that we don't kill mdmon or a tool like that if the root
dir is on RAID. If we'd suddenly start killing those, then many
people would be very unhappy. I mean, you'd just shift one storage
problem to become a slightly different storage problem.

Also, as I see it, this is entirely between the grub and kernel folks
to figure out. I am pretty happy to stand back from this discussion.

> Really though, the bootloaders each need to do this stupid
> FIFREEZE/FITHAW if they're going to support file systems that refuse
> to sync() correctly. And that's because ultimately it is not systemd's
> responsibility to remount-ro in order to make sure the bootloader's
> changes are properly committed to disk. The GRUB folks want to support
> XFS? Fine. That means either supporting FIFREEZE/FITHAW at grub.cfg
> create time, or they have to teach the bootloader how to read a
> journal. The former is a metric buttton easier than the latter, even
> if it's not an atomic operation.

Well, you can also make it simple and use a file system that doesn't
need all that.

> > Oh, right. This approach already failed with Grub, with its
> > relatively large commercial support, and now you want to pile on?
> 
> Oh please stop. It's not a failed approach. Windows and macOS have
> been doing this reliably for 20 years because, holy crap! they

Well, still, the fs situation with grub is bad, see the xfs mess. This
is not going to be any better if you pick some random fs nobody uses
(such as udf or all the other exotic stuff that was mentioned).

I understand you have some love for niche file systems, but seriously,
a boot process is not a place where you want to try something new and
shiny, but where you stick to the boring stuff one can manage.

Also, let me stress one thing: Windows and macOS are in a very
different position: they care only about their respective islands and
they control the hardware to a much higher degree than we do. I mean,
for Apple it's easy to support HFS+ in the UEFI firmware, because
their stuff only needs to run on Macs. We do not have that luxury, we
need to work with what we got. We are the ones who have to make
multi-boot work. And thus we should really pick something that is
simple, understood by everybody, and can bridge the gap to other OSes,
firmwares and boot loaders.

> actually commit their bootloader changes before rebooting!
> 
> Now if you want to point fingers at why the bootloader changes aren't
> committed properly before reboot, have at it, I'm totally on board.
> But you don't get to point to one tiny fail and say the whole concept
> is a failure, when there's billions of implementations that do the
> same damn thing correctly on NTFS and JHFS+ and neither of them are
> doing journal replays in their bootloader either. They expect things
> to be properly committed before reboot is even called. And they can do
> this because they aren't doing weird shit like persistently mounting
> their ESPs or boot volumes and expecting it's someone else's problem
> to cleanly unmount them.

Well, as far as I know the xfs mess is not caused by a simple
"bug". It's caused by philosophical differences: the xfs kernel
folks made clear that the in-kernel xfs implementation is the only one
they care for and that other implementations, such as the one in grub,
are not supportable. I am pretty sure the other general purpose
file system folks aren't too far off in their thinking. FAT bypasses
this philosophical, social problem: the FAT driver is pretty well
supported, and as the format is pretty much set in stone and no new
features are continuously added, it's relatively easy to support
between multiple peers that all want to support it separately.

> > UDF? When's the last time you actually used that? I mean, I don't even
> > have a DVD drive anymore, where I could find an UDF file system on...
> 
> I use it regularly on large flash media because there's no file size
> limit like vfat, it supports Unix permissions, symlinks, hardlinks,
> and also isn't proprietary like vfat or exfat. Since ancient times
> it's been intended for random read/write support on flash media and
> even hard drives, it's not a DVD only thing. It has way better cross
> platform support than exFAT, NTFS, or HFS+ - and without the vfat
> limitations.

I must say I haven't seen a UDF file system anywhere in ages. I am
not sure it's really as well supported and universally used as you
suggest.

> > Also, it's read-only afair, hence stuff like boot counting is not
> > going to work, it's a dead end.
> 
> It is not a read only affair. You're confusing UDF with ISO 9660. They
> aren't the same thing. UDF has offered random read write support for
> hard drives before flash drives were even a thing, since its
> inception.
> 
> https://en.wikipedia.org/wiki/Universal_Disk_Format

Hmm, looking at the Wikipedia page it appears that support on Linux
has languished. The current versions of udf (2.50, 2.60, from 2003)
are only supported read-only? Is there an active community around udf
on Linux?

Quite frankly udf appears entirely unmaintained... I tried to track
down the location where mkudffs is maintained, but all google finds
are dead sourceforge pages. And links like
"https://www.reddit.com/r/linux/comments/6giwvi/why_isnt_udf_given_more_attentiondeveloped_fully/",
which don't precisely make me want to trust this.

Again, stick to something supportable. udffs is not it. Don't make a
barely supported fringe fs a core piece of our boot process. I mean,
seriously!

Lennart

-- 
Lennart Poettering, Red Hat
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx/message/FFYUHOFQAP4GSQD54FXKG5COPFGRZO53/



