Re: Installation image layout


 



On Sat, Oct 13, 2018 at 8:17 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Oct 12, 2018 at 5:26 PM, Marek Marczykowski-Górecki
> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > On Fri, Oct 12, 2018 at 03:44:38PM -0600, Chris Murphy wrote:
>
> >> mkfs.btrfs has --rootdir and --shrink features to pre-allocate a
> >> volume with files at mkfs time; I have no idea to what degree it
> >> depends on kernel code.
> >
> > Probably not at all, given it works as non-root user too.
> > I've tried to run it twice on the same directory (and with the same
> > --uuid) on 32MB of data and got different images (~2000 lines of hexdump
> > diff). Could be some timestamps, could be something else.
>
> There is volume UUID which is what --uuid affects. But there are other
> uuids, including the chunk uuid which gets repeated in every leaf and
> node along with the volume uuid and device uuid; each file tree
> (subvolume) gets its own uuid, etc. Timestamps include atime, otime,
> mtime, and ctime. Some objects have all 0's for uuid, and some items
> have only 0.0 for times. I'll float the reproducibility question on
> the Btrfs list, if it's desirable, useful, and how difficult it is. I
> think subsetting Btrfs features to reduce complexity generally, and
> therefore increase reproducibility as a consequence of that, has
> merit.
>

This is a really interesting idea...
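
To narrow down where two supposedly identical images diverge, a block-level
comparison is easier to read than a raw hexdump diff. Below is a minimal
sketch, not anything mkfs.btrfs provides: the function name and the 4096-byte
block size are my own assumptions, chosen only for illustration.

```python
def differing_blocks(path_a, path_b, block_size=4096):
    """Return the indices of fixed-size blocks that differ between two files.

    block_size is an illustrative assumption; pick whatever granularity
    matches the filesystem being inspected.
    """
    diffs = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        index = 0
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if not a and not b:
                break  # both files exhausted
            if a != b:
                diffs.append(index)  # also catches a length mismatch
            index += 1
    return diffs
```

If the differing blocks cluster in a few places, that hints at a handful of
uuid or timestamp fields rather than a layout difference.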

>
> >> It's also
> >> possible with dm-verity or dm-integrity but then that adds back the dm
> >> complexity.
> >
> > Oh, please, no...
>
> Haha...
>

This made me giggle a bit. :)

> >
> > There are two almost separate aspects here:
> >  - image layout (squashfs+ext4, squashfs alone, squashfs+btrfs)
> >  - how copy-on-write is achieved (dm-snapshot, overlay fs)
>
> ext4 alone, and btrfs alone are also viable. But since ext4 has no
> compression, image size grows by maybe a factor of 2. Btrfs supports
> lzo and zlib compression since forever, and zstd since kernel 4.14,
> same as squashfs. What's been missing is mksquashfs with zstd support,
> which I imagine will be in 5.0. The compression ratio compares well
> with xz currently being used by mksquashfs in Fedora composes, but
> with much less CPU to compress and decompress. So I'd say go with zstd
> in any case.
>

The squashfs kernel driver has supported zstd since kernel 4.14, the same
release as btrfs. zstd support was merged into squashfs-tools a year ago:
https://github.com/plougher/squashfs-tools/commit/6113361316d5ce5bfdc118d188e5617a1fcd747c

However, there have been no squashfs-tools releases since the migration
from CVS on SourceForge to Git on GitHub.
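
To check which compressor an existing squashfs image was built with, the
superblock can be read directly. This is a minimal sketch based on my
understanding of the squashfs 4.0 on-disk layout (little-endian magic at
offset 0, 16-bit compressor id at offset 20); treat the offsets and the id
table as assumptions to verify against the kernel's squashfs_fs.h. The
function name is hypothetical.

```python
import struct

SQUASHFS_MAGIC = 0x73717368  # the bytes "hsqs" read as a little-endian u32
# Compressor ids as I understand them from the kernel headers (assumption).
COMPRESSORS = {1: "gzip", 2: "lzma", 3: "lzo", 4: "xz", 5: "lz4", 6: "zstd"}

def squashfs_compressor(header: bytes) -> str:
    """Return the compressor name recorded in a squashfs superblock."""
    if len(header) < 22:
        raise ValueError("need at least the first 22 bytes of the image")
    (magic,) = struct.unpack_from("<I", header, 0)
    if magic != SQUASHFS_MAGIC:
        raise ValueError("not a little-endian squashfs image")
    (comp_id,) = struct.unpack_from("<H", header, 20)
    return COMPRESSORS.get(comp_id, f"unknown ({comp_id})")
```

Passing the first 22 bytes of an image file (for example, from
`open(path, "rb").read(22)`) is enough.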

>
> >
> > For reproducibility, squashfs alone is the best option, but does not
> > improve integrity checking (but also doesn't make it worse).
>
> I'm not able to estimate how much work it is to add a file hash
> manifest to squashfs, and to always use it on reads, and then add some
> error handling to return EIO upon any mismatch. But yeah it'd need user space
> code in mksquashfs and also kernel code to support it.
>
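
The file hash manifest idea can be illustrated in user space. This is only a
sketch of the concept, not squashfs or mksquashfs code: the function names,
the choice of SHA-256, and whole-file hashing are all my assumptions, and a
real implementation would hash per block inside the kernel read path.

```python
import errno
import hashlib
import os

def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 hex digest."""
    manifest = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            with open(path, "rb") as f:
                digest = hashlib.sha256(f.read()).hexdigest()
            manifest[os.path.relpath(path, root)] = digest
    return manifest

def verified_read(root, relpath, manifest):
    """Read a file, failing with EIO if it no longer matches the manifest."""
    with open(os.path.join(root, relpath), "rb") as f:
        data = f.read()
    if hashlib.sha256(data).hexdigest() != manifest[relpath]:
        raise OSError(errno.EIO, "checksum mismatch", relpath)
    return data
```

The manifest would be built at image creation time and consulted on every
read, so that corruption surfaces as an I/O error instead of silent bad data.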
>
> > As for copy-on-write, dm-snapshot is quite complex to set up and requires
> > the underlying FS to support writes. It also doesn't allow writing more
> > data than the original image size (may be an issue for the persistent
> > partition case). Overlay fs, on the other hand, works with any underlying
> > fs, and you can write as much data as you want. And in the persistent
> > partition case, you can access that data even if the base image (the lower
> > layer) is unavailable/broken. I think the only downside of overlay fs is
> > that when you modify a large file, it gets copied in full to the upper
> > layer. But I don't think that's an issue in this use case.
> >
> > For me, overlay fs is a clear winner here.
> > But as for image layout, it isn't that simple. For reproducibility,
> > squashfs alone is better. But if the goal of this change is also to
> > improve read error detection, then it isn't that clear anymore. It
> > may be that it takes a simple mkfs.btrfs patch to make it reproducible,
> > but it isn't obvious to me at this stage. Also, keeping two layers
> > looks like unnecessary complexity.
>
> I agree. Overlayfs works fine with any of the discussed filesystems.
> I'd give a slight edge to Btrfs seed+sprout as the overlay mechanism
> in the case of persistence on a USB stick: a) checksumming b)
> compression helps improve performance of USB flash drives and reduces
> wear c) kernel discovers both seed and sprout in early boot by sprout
> uuid alone, no special mount options needed for setup. But it's a
> really minor point because a) and b) are still possible with overlayfs
> with a new independent btrfs as the upperdir.
>
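
For reference, the overlayfs arrangement discussed above (read-only base
image as the lower layer, a separate writable filesystem providing the upper
layer) comes down to a single mount call. A minimal sketch that only builds
the command line; the helper name and all paths in the usage note are
hypothetical, not what lorax or dracut actually use.

```python
def overlay_mount_command(lower, upper, work, target):
    """Build the argument list for mounting an overlayfs at target.

    lower is the read-only base tree; upper and work must live on the same
    writable filesystem, and work should be an empty directory.
    """
    options = f"lowerdir={lower},upperdir={upper},workdir={work}"
    return ["mount", "-t", "overlay", "overlay", "-o", options, target]
```

For example, `overlay_mount_command("/run/base", "/run/persist/upper",
"/run/persist/work", "/sysroot")` yields a list suitable for
`subprocess.run`, with a btrfs persistence partition mounted at the
hypothetical `/run/persist`.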
>
> > What do you think about sidestepping this discussion a little and
> > replacing dm-snapshot with overlay fs regardless of other changes here?
> > That should be doable without any change to image format and will give
> > more flexibility there.
>
> Agreed. What I can't tell you off hand is if livecd-iso-to-disk would
> be affected by this in some way; or whether the change policy applies.
> But I think it's better to file the change so there's awareness and
> coordination: installer team would have to sign off on the pull
> request for lorax, and then releng team probably should know about it
> because they define their own compose settings (I guess they often use
> upstream's defaults but they don't have to), and then QA might want a
> heads up so if things blow up they know who to ask what's up, and then
> it's also a good idea to let SOAS folks know about it. And a central
> point of filing changes is coordination.
>

As the upstream for livecd-tools[1] (and thus livecd-iso-to-disk), I'd
be very interested in changes to support both Btrfs seed+sprout and
Btrfs+OverlayFS combinations.

[1]: https://github.com/livecd-tools/livecd-tools



-- 
真実はいつも一つ!/ Always, there's only one truth!
_______________________________________________
devel mailing list -- devel@xxxxxxxxxxxxxxxxxxxxxxx
To unsubscribe send an email to devel-leave@xxxxxxxxxxxxxxxxxxxxxxx
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/devel@xxxxxxxxxxxxxxxxxxxxxxx



