Hi Amir,

On Tue, Apr 11, 2017 at 01:37:53PM +0300, Amir Goldstein wrote:
> On Mon, Apr 10, 2017 at 5:20 PM, Tycho Andersen <tycho@xxxxxxxxxx> wrote:
> > Hi Amir,
> >
> > On Sat, Apr 08, 2017 at 09:35:01PM +0200, Amir Goldstein wrote:
> >> [moving this discussion over from fsdevel to containers list and
> >> changing the title]
> >>
> >> On Tue, Apr 4, 2017 at 9:07 PM, Tycho Andersen <tycho@xxxxxxxxxx> wrote:
> >> > On Tue, Apr 04, 2017 at 09:59:16PM +0300, Amir Goldstein wrote:
> >> >> On Tue, Apr 4, 2017 at 9:01 PM, Tycho Andersen <tycho@xxxxxxxxxx> wrote:
> >> >> > On Tue, Apr 04, 2017 at 12:47:52PM -0500, Serge E. Hallyn wrote:
> >> >> >> > Would lxc-snapshot gain anything from the ability to fsfreeze an overlay
> >> >> >> > mount?
> >> >> >>
> >> >> >> lxc-snapshot only works on stopped containers. 'lxc snapshot' can do live
> >> >> >> snapshots using criu. Tycho, does that do anything right now to freeze the
> >> >> >> fs?
> >> >> >
> >> >> > Not that I'm aware of (CRIU might, but we don't in liblxc).
> >> >> >
> >> >> >> I'm not sure that freezing all the tasks is necessarily enough to settle
> >> >> >> the fs, but I assume you're doing something about that already?
> >> >> >
> >> >> > I suspect it's not, but we're not doing anything besides freezing the
> >> >> > tasks. In fact, we freeze the tasks by using the freezer cgroup,
> >> >> > which itself is buggy, since the freezer cgroup can race with various
> >> >> > filesystems. So, freezing tasks is hard, and I haven't even thought
> >> >> > about how to freeze the fs for real :)
> >> >> >
> >> >> > But in any case, an fs freezing primitive does sound useful for
> >> >> > checkpoint restore, assuming that we're right and freezing the tasks
> >> >> > is simply not enough.
> >> >> >
> >> >>
> >> >> So I already asked Pavel that question and he said that freezing
> >> >> the tasks is enough. I am not convinced it is really enough to bring
> >> >> a file system image (i.e.
> >> >> underlying blockdev) to a quiescent state,
> >> >> but I think it may be enough for getting a stable view of the mounted
> >> >> file system, so the files could be dumped somewhere.
> >> >> I am guessing that is what lxc snapshot does?
> >> >
> >> > Yes, lxc snapshot is basically just a frontend for CRIU.
> >> >
> >> >> I still didn't understand wrt lxc snapshot, is there a use case for
> >> >> taking live snapshots without using CRIU? (because freezer cgroup
> >> >> mentioned races or whatnot?).
> >> >
> >> > No, I think CRIU is the only project that will ever attempt to do
> >> > checkpoint restore this way ;-).
> >>
> >> I don't doubt that.
> >>
> >> My question is whether it is interesting to snapshot a live container fs
> >> without having to checkpoint nor restore at all.
> >>
> >> > CRIU supports two different ways of
> >> > freezing tasks: one using the freezer cgroup and one without. The one
> >> > without doesn't work against fork bombs very well, and the one with
> >> > doesn't work because of some filesystems. So it's mostly a container
> >> > engine implementation choice which to use.
> >> >
> >> >> It's definitely possible with btrfs and if my overlayfs freeze patches
> >> >> are not terribly wrong, then it should be easy with overlayfs as well.
> >> >> Does lxc snapshot already support live snapshot of btrfs container?
> >> >
> >> > Yes, it does. It freezes the tasks via the cgroup freezer and then
> >> > does a btrfs snapshot of the filesystem once the tasks are frozen.
> >> >
> >>
> >> So what I am not sure is if there are use cases where criu cannot be
> >> used or maybe there are reasons not to use it, and for these cases
> >> if it may be interesting to support snapshot of the storage by:
> >> - fsfreeze -f
> >> - copy upper dir
> >> - fsfreeze -u
> >
> > I don't see a reason for it, but perhaps I'm not being very
> > imaginative. Without the memory state, the potentially inconsistent fs
> > state doesn't seem very helpful.
> >
>
> Hi Tycho,
>
> The use case is quite simple really.
> Same use case as any LVM snapshot and btrfs snapshot on a
> non-containerized system:
> Before installing some stuff, sync, take a snapshot of the root fs and
> you can always restart your system from that snapshot of root fs if
> something went wrong.
>
> You don't need to save any memory state for that and you don't need to dump any
> processes info for that.
> It's simply a snapshot that you can *start* from and not *resume* from.
>
> I am quite surprised to learn that containers don't have that
> functionality (they don't?).
> I guess it may be because containers CAN freeze processes, so they do it,
> but it's really not a prerequisite for live *image* snapshot -
> fsfreeze is enough.

Well, the problem is when some container has some state in memory that
it hasn't tried to commit to disk yet. Doing an fsfreeze on a running
container doesn't seem safe in the general case.

Of course offline (i.e. the container is not currently running)
freezes are safe and in wide use today; I was speaking only of online
freezes.

> The thing is it is easy to snapshot container image based on LVM and btrfs today
> (lvm snapshot command does fsfreeze on the file system on top of lvm volume),
> but it is not possible to snapshot container image based on overlayfs
> the same way.
>
> My patches implement fsfreeze for overlayfs, and quite frankly, I am
> taken by surprise that container users don't find this useful.
> I may be missing something.

I don't think you are. Container engines today use the snapshotting
features of LVM, btrfs (and zfs) for offline freezes (and indeed,
features like `btrfs send` and online snapshots to speed up live
migration).

Cheers,

Tycho
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linuxfoundation.org/mailman/listinfo/containers
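
For reference, the three-step storage snapshot proposed in the quoted
thread (fsfreeze -f, copy upper dir, fsfreeze -u) could be sketched
roughly as below. This is only a minimal sketch under stated
assumptions, not something lxc or CRIU ships: the function name
snapshot_upper and its parameters are hypothetical, freezing an
overlayfs mount assumes a kernel where fsfreeze works on overlayfs
(for example with the patches under discussion), and `cp -a` is just
one possible way to copy the upper dir. State that a running
container still holds in memory and has not written to disk is not
captured by this.

```python
import subprocess
from typing import Callable, Sequence


def _run_checked(cmd: Sequence[str]) -> None:
    """Run a command and raise CalledProcessError if it exits non-zero."""
    subprocess.run(list(cmd), check=True)


def snapshot_upper(
    mountpoint: str,
    upper_dir: str,
    dest: str,
    run: Callable[[Sequence[str]], None] = _run_checked,
) -> None:
    """Freeze the mount, copy the overlay upper dir, then unfreeze.

    All names here are hypothetical. `run` is injectable so the
    sequence can be exercised without root or a real overlay mount.
    """
    # Step 1: block new writes and flush dirty data (util-linux fsfreeze).
    run(["fsfreeze", "-f", mountpoint])
    try:
        # Step 2: copy the upper dir; -a keeps ownership, modes, links
        # and special files where the caller has permission to do so.
        run(["cp", "-a", upper_dir, dest])
    finally:
        # Step 3: always unfreeze, even if the copy failed, so the
        # container is not left blocked on writes.
        run(["fsfreeze", "-u", mountpoint])
```

The try/finally is the main design point: a frozen filesystem blocks
writers until it is thawed, so the unfreeze has to run on the error
path too. A call might look like
snapshot_upper("/path/to/merged", "/path/to/upper", "/path/to/backup"),
with all three paths being placeholders.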