>> the self checkpoint and self restore syscalls, like Oren is proposing, are
>> simpler but they require the process cooperation to be triggered. we could
>> imagine doing that in a special signal handler which would allow us to jump
>> in the right task context.
>
> This description is not accurate:
>
> For checkpoint, both implementations use an "external" task to read the state
> from other tasks. (In my implementation that "other" task can be self).

which is good, since some applications want to checkpoint themselves and
that's a way to provide them a generic service.

> For restart, both implementations expect the restarting process to restore its
> own state. They differ in that Andrew's patchset also creates that process
> while mine (at the moment) relies on the existing (self) task.

Hmm, it seems that your patchset relies on the fact that the tasks are
checkpointed and restarted at a syscall boundary, right? I might be
completely wrong on that :)

> In other words, none of them will require any cooperation on part of the
> checkpointed tasks, and both will require cooperation on part of the restarting
> tasks (the latter is easy since we create and fully control these tasks).

yes.

>> I don't have any preference but looking at the code of the different patchsets
>> there are some tricky areas and I'm wondering which path is easier, safer,
>> and portable.
>
> I am thinking which path is preferred: create the processes in kernel space
> (like Andrew's patch does) or in user space (like Zap does). In the mini-summit
> we agreed in favor of kernel space, but I can still see arguments why user space
> may be better.

I'm more familiar with the second algorithm: restarting the process tree in
user space and letting each task restart itself with the sys_restart syscall.
But that's because I've been working on a C/R framework which freezes tasks
on a syscall boundary, which makes a developer's life easy for restart.

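To make the user-space variant concrete, here is a rough sketch of rebuilding a process tree with fork() and letting each task restore itself. Everything in it is an assumption for illustration: `struct task_desc`, `restore_self()` and `restart_subtree()` are hypothetical names, `restore_self()` is only a stub standing in for the proposed sys_restart call, and the children exit to report a count instead of resuming execution as a really restarted task would.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical description of one checkpointed task in the image. */
struct task_desc {
    int id;     /* identifier recorded in the checkpoint image */
    int parent; /* index of the parent in the table, -1 for the root */
};

/*
 * Stub standing in for the proposed sys_restart syscall (an assumption,
 * not a real interface): a real version would ask the kernel to load the
 * saved state of the calling task. Here it always succeeds.
 */
static int restore_self(const struct task_desc *t)
{
    (void)t;
    return 0;
}

/*
 * Recreate the subtree rooted at tasks[self] from user space: fork one
 * child per direct descendant, let each child rebuild its own subtree,
 * then restore the calling task itself. Returns the number of tasks
 * restored in the subtree, or -1 on failure. Exit statuses carry the
 * counts, so this sketch only works for trees of fewer than 255 tasks,
 * and it does not reap already-forked children on the error path.
 */
int restart_subtree(const struct task_desc *tasks, int n, int self)
{
    int restored = 0;
    int nchildren = 0;

    assert(tasks != NULL && self >= 0 && self < n);

    for (int i = 0; i < n; i++) {
        if (tasks[i].parent != self)
            continue;

        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0) {
            /* Child: we are now in the context of task i. */
            int r = restart_subtree(tasks, n, i);
            _exit(r < 0 ? 255 : r);
        }
        nchildren++;
    }

    /* Collect the counts reported by the direct children. */
    for (int i = 0; i < nchildren; i++) {
        int status;

        if (wait(&status) < 0 || !WIFEXITED(status) ||
            WEXITSTATUS(status) == 255)
            return -1;
        restored += WEXITSTATUS(status);
    }

    /* The task restores its own state last, once its subtree exists. */
    if (restore_self(&tasks[self]) < 0)
        return -1;
    return restored + 1;
}
```

With a table such as `{{1,-1},{2,0},{3,0},{4,1}}`, calling `restart_subtree(tasks, 4, 0)` from the root forks two children and one grandchild and returns 4. The point of the sketch is only that process creation stays in user space while each task is responsible for its own state; it says nothing about how either patchset actually does it.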
But as you know, a restarted process resumes its execution where it was
checkpointed. So I'm wondering what the hidden issues are with an in-kernel
checkpoint and in-kernel restart. To be more precise, why does Andrey need an
i386_ret_from_resume trampoline in:

	http://lkml.org/lkml/2008/9/3/181

and why don't you?

> (note: I refer strictly to the creation of the processes during restart, not
> how their state is restored).

OK

> any thoughts ?

thanks Oren,

C.