On Sat, Oct 13, 2018 at 8:17 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Oct 12, 2018 at 5:26 PM, Marek Marczykowski-Górecki
> <marmarek@xxxxxxxxxxxxxxxxxxxxxx> wrote:
> > On Fri, Oct 12, 2018 at 03:44:38PM -0600, Chris Murphy wrote:
>
> >> mkfs.btrfs has --rootdir and --shrink features to pre-allocate a
> >> volume with files at mkfs time; I have no idea to what degree it
> >> depends on kernel code.
> >
> > Probably not at all, given it works as non-root user too.
> > I've tried to run it twice on the same directory (and with the same
> > --uuid) on 32MB of data and got different images (~2000 lines of hexdump
> > diff). Could be some timestamps, could be something else.
>
> There is volume UUID which is what --uuid affects. But there are other
> uuids, including the chunk uuid which gets repeated in every leaf and
> node along with the volume uuid, device uuid, each files tree
> (subvolume) get its own uuid, etc. Time stamps include atime, otime,
> mtime, and ctime. Some objects have all 0's for uuid, and some items
> have only 0.0 for times. I'll float the reproducibility question on
> the Btrfs list, if it's desirable, useful, and how difficult it is. I
> think subsetting Btrfs features to reduce complexity generally, and
> therefore increase reproducibility as a consequence of that, has
> merit.
>

This is a really interesting idea...

>
> >> It's also
> >> possible with dm-verity or dm-integrity but then that adds back the dm
> >> complexity.
> >
> > Oh, please, no...
>
> Haha...
>

This made me giggle a bit. :)

> >
> > There are two almost separate aspects here:
> > - image layout (squashfs+ext4, squashfs alone, squashfs+btrfs)
> > - how copy-on-write is achieved (dm-snapshot, overlay fs)
>
> ext4 alone, and btrfs alone are also viable. But since ext4 has no
> compression, image size grows by maybe a factor of 2. Btrfs supports
> lzo and zlib compression since forever, and zstd since kernel 4.14,
> same as squashfs.
> What's been missing is mksquashfs with zstd support,
> which I imagine will be in 5.0. The compression ratio compares well
> with xz currently being used by mksquashfs in Fedora composes, but
> with much less CPU to compress and decompress. So I'd say go with zstd
> in any case.
>

The squashfs kernel driver has supported zstd since kernel 4.14, the
same release that added it to btrfs. zstd support was mainlined into
squashfs-tools a year ago:
https://github.com/plougher/squashfs-tools/commit/6113361316d5ce5bfdc118d188e5617a1fcd747c

However, there have been no squashfs-tools releases since the migration
from CVS on SourceForge to Git on GitHub.

>
> >
> > For reproducibility, squashfs alone is the best option, but does not
> > improve integrity checking (but also doesn't make it worse).
>
> I'm not able to estimate how much work it is to add a files hash
> manifest to squashfs, and to always use it on reads, and then add some
> error handling to EIO upon any mismatch. But yeah it'd need user space
> code in mksquashfs and also kernel code to support it.
>
> >
> > As for copy-on-write, dm-snapshot is quite complex to setup and require
> > underlying FS to support write. Also, doesn't allow to write more data
> > than original image size (may be an issue for persistent partition
> > case). Overlay fs on the other hand works with any underlying fs, you
> > can write as much data as you want. And in case of persistent partition,
> > you can access that data even if base image (the lower layer) is
> > unavailable/broken. I think the only downside of overlay fs is when you
> > modify large file it gets copied in full to the upper layer. But I don't
> > think that's an issue in this use case.
> >
> > For me, overlay fs is a clear winner here.
> > But as for image layout, it isn't that simple. For reproducibility,
> > squashfs alone is better. But if the goal of this change would be also
> > improving read errors detection, then it isn't that clear anymore.
> > It may be that it takes a simple mkfs.btrfs patch to make it reproducible,
> > but it isn't obvious for me at this stage. Also, keeping two layers
> > looks like unnecessary complexity.
>
> I agree. Overlayfs works fine with any of the discussed filesystems.
> I'd give a slight edge to Btrfs seed+sprout as the overlay mechanism
> in the case of persistence on a USB stick: a) checksumming b)
> compression helps improve performance of USB flash drives and reduces
> wear c) kernel discovers both seed and sprout in early boot by sprout
> uuid alone, no special mount options needed for setup. But it's a
> really minor point because a) and b) are still possible with overlayfs
> with a new independent btrfs as the upperdir.
>
> >
> > What do you think about sidestepping this discussion a little and
> > replacing dm-snapshot with overlay fs regardless of other changes here?
> > That should be doable without any change to image format and will give
> > more flexibility there.
>
> Agreed. What I can't tell you off hand is if livecd-iso-to-disk would
> be affected by this in some way; or whether the change policy applies.
> But I think it's better to file the change so there's awareness and
> coordination: installer team would have to sign off on the pull
> request for lorax, and then releng team probably should know about it
> because they define their own compose settings (I guess they often use
> upstreams defaults but they don't have to), and then QA might want a
> heads up so if things blow up they know who to ask what's up, and then
> it's also a good idea to let SOAS folks know about it. And a central
> point of filing changes is coordination.
>

As the upstream for livecd-tools[1] (and thus livecd-iso-to-disk), I'd
be very interested in changes to support both Btrfs seed+sprout and
Btrfs+OverlayFS combinations.

[1]: https://github.com/livecd-tools/livecd-tools

--
真実はいつも一つ!/ Always, there's only one truth!