Yes. Snapper on openSUSE is doing this already on Btrfs. I'm not sure how it's dealt with on LVM thinp, since /boot has to be outside LVM thinp: while GRUB groks conventional LVM, it doesn't get thinp yet. GRUB does understand /boot on Btrfs, but Fedora's grubby has a problem with it [1]. I've also been making /var/log a separate subvolume, which makes it immune to rootfs snapshots and rollbacks.
Note for OSTree, /var/lib/rpm -> /usr/share/rpm (it's also immutable). Same for /var/lib/yum.
Is there a good chance of optimizing OSTree to use LVM thinp and Btrfs snapshots instead of hardlinks, while still being in charge of the proper semantic enforcement?
Note OSTree already today uses BTRFS_IOC_CLONE if on btrfs for implementing the separate copies of /etc. (Actually this happens via the generic g_file_copy() since https://git.gnome.org/browse/glib/commit/?id=5eba9784979e0b723c05a45cf767046607e4e759 )
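For anyone who wants to see the same kind of clone from the command line: GNU cp exposes reflink copies via --reflink. This is only a rough sketch and not what OSTree itself calls; the file names are made up for illustration, and on a filesystem without clone support --reflink=auto simply falls back to a normal copy.

```shell
#!/bin/sh
# Sketch: copy-on-write clone of a config file with GNU cp.
# File names are hypothetical; this is not OSTree's own code path.
set -eu

work=$(mktemp -d)
printf 'setting=1\n' > "$work/etc.conf"

# --reflink=auto shares extents when the filesystem supports cloning
# (e.g. Btrfs) and falls back to an ordinary data copy otherwise.
cp --reflink=auto "$work/etc.conf" "$work/etc.conf.new"

# Unlike a hardlink, the clone is a separate inode, so editing it in
# place leaves the original untouched.
printf 'setting=2\n' > "$work/etc.conf.new"

orig_content=$(cat "$work/etc.conf")
copy_content=$(cat "$work/etc.conf.new")
echo "original: $orig_content"
echo "copy:     $copy_content"

rm -rf "$work"
```

The point is that a clone behaves like an independent copy for writers, which is why it suits /etc, where local edits are expected.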
Beyond that though - because for OSTree, /usr is immutable, there isn't really a big advantage of thinp or btrfs snapshots. Just try this right now on your laptop:
# Once for cold cache performance
time cp -al /usr /usr.copy
# And once for hot cache
time cp -al /usr /usr.copy2
For me (and this is a real-world RHEL7 system with a 5.1G /usr):
[root@localhost /]# time cp -al usr usr.copy
real 0m5.199s
user 0m0.220s
sys 0m2.849s
[root@localhost /]# time cp -al usr usr.copy2
real 0m2.245s
user 0m0.166s
sys 0m2.049s
That's really fast enough for the use cases I envision, for now. Obviously FS/block snapshots have other advantages beyond being instant - for example, they don't incur lots of scattered writes to bump the refcounts of inodes. But many systems already have that happening periodically to a lesser degree with the default of relatime anyways.
Where FS/block snapshots become *necessary* is if you have *uncontrolled writes* to /usr. For example, with OSTree's hardlink model, I cannot allow arbitrary rpm %post code to run. Each one has to be carefully audited to break hardlinks via "write new copy, rename" instead of doing edits in place.
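To make the difference concrete, here is a small sketch of an in-place edit versus "write new copy, rename" inside a hardlinked tree. It assumes GNU cp and mktemp; the directory layout and the replace_file helper are hypothetical, just for illustration.

```shell
#!/bin/sh
# Sketch: in-place edits leak across a hardlink farm; rename does not.
# All paths and the replace_file helper are hypothetical.
set -eu

work=$(mktemp -d)
mkdir "$work/tree"
echo "original" > "$work/tree/app.conf"
echo "original" > "$work/tree/motd"

# Hardlink copy, as with "cp -al /usr /usr.copy": each file in the
# copy shares its inode with the file in the source tree.
cp -al "$work/tree" "$work/tree.copy"

# Safe replacement: write a new inode beside the target, then rename
# it over the target. The old inode stays intact for the other tree.
# (A real implementation would also carry over mode and ownership.)
replace_file() {
    target=$1
    content=$2
    tmp=$(mktemp "$target.XXXXXX")
    printf '%s\n' "$content" > "$tmp"
    mv -f "$tmp" "$target"
}

replace_file "$work/tree.copy/app.conf" "updated"
safe_old=$(cat "$work/tree/app.conf")
safe_new=$(cat "$work/tree.copy/app.conf")

# Unsafe edit for contrast: ">" truncates and rewrites the shared
# inode, so the change shows up in both trees.
echo "clobbered" > "$work/tree.copy/motd"
inplace_other=$(cat "$work/tree/motd")

echo "after rename:        tree=$safe_old copy=$safe_new"
echo "after in-place edit: tree=$inplace_other"

rm -rf "$work"
```

A %post script that does the equivalent of the second edit would silently modify every deployment sharing that inode, which is exactly what the audit has to rule out.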
This is necessary to allow a story for local software installation. We don't need to do it though for the "pure replication" model where *no* RPM %post runs on client systems - it all happens on the build server.
This replication model is where OSTree is strongest right now, and where the traditional package model is weakest, so I have been mainly emphasizing it.
That said, doing this careful auditing of RPM %post and in general laying the foundations for a package-like system on top of OSTree is very much in the long term plans.
Yes, I also don't think there is just one kind of "rollback", since there can be different contexts. A user rolling back their /home doesn't mean rolling back any other user's, or the system. Conversely, rolling back the system doesn't mean rolling back user /home or logs or some other things.
Definitely. OSTree doesn't touch /home (note this is now /var/home) - and so it makes a lot of sense to still have something that's more like a backup system. Particularly a backup system that knows to take a backup before OSTree upgrades.
That's where using BTRFS or thinp in *combination* with OSTree is really nice - that total freedom to do whatever you want at the block layer means you can choose to have /home (/var/home) on a separate partition and do thinp snapshots of it. Or use BTRFS's per-subvolume RAID to say you want RAID0 for /, and RAID1 for /home.
To answer your question in another way then - I'll definitely be fast to take advantage of any new APIs added by the storage layer to *transparently* make things better for OSTree. But I don't want to mandate any particular partition layout or FS/block level layout, because I think it takes away too much administrator flexibility.
-- desktop mailing list desktop@xxxxxxxxxxxxxxxxxxxxxxx https://admin.fedoraproject.org/mailman/listinfo/desktop