On Fri, 5 Feb 2010, Matt Helsley wrote: > On Fri, Feb 05, 2010 at 08:04:12PM -0500, Oren Laadan wrote: > > > > > > Serge E. Hallyn wrote: > > > Quoting Oren Laadan (orenl@xxxxxxxxxxxxxxx): > > >> > > >> Serge E. Hallyn wrote: > > >>> Quoting Oren Laadan (orenl@xxxxxxxxxxxxxxx): > > >>>> Serge E. Hallyn wrote: > > >>>>> Quoting Oren Laadan (orenl@xxxxxxxxxxxxxxx): > > >>>>>> Cool ! > > >>>>>> > > >>>>>> So what do we have working now for 64 bit kernel (for 32 bit kernel > > >>>>>> we know it works...): > > >>>>>> > > >>>>>> 'restart' checkpointed > > >>>>>> program program > > >>>>>> ---------------------------------------- > > >>>>>> 64bit 64bit -> works > > >>>>>> 32bit 32bit -> works > > >>>>>> > > >>>>>> 64bit 32bit -> ????? > > > > > > s/?????/Rejected/ > > > > > > CKPT_ARCH_ID is of course different for X86_32 than X86_64, so > > > we refuse restart in restore_read_header(). > > > > > > -serge > > > > > > > lol ... that's actually funny ! > > > > Anyway, in light of the IRC discussions, here are the cases again: > > > > > > original original restart target > > program kernel program kernel > > -------- --------- -------- -------- > > 64 bit 64 bit 64 bit 64 bit [0] works > > > > 32 bit 32 bit 32 bit 32 bit [0] works > > 32 bit 64 bit 32 bit 64 bit [0] works > > > > 32 bit 32 bit 32 bit 64 bit [1] > > 32 bit 64 bit 32 bit 32 bit [1] > > > > 32 bit any 64 bit 64 bit [2] > > 64 bit 64 bit 32 bit 64 bit [2] > > > > [0] The first 3 cases are "homogeneous", with conditions equal at > > checkpoint and restart. AFAIK, they work. > > > > [1] The next two cases consider 32 bit program, and vary only the > > environment - the kernel may change from 32 to 64 or back. We want > > them to work. > > > > IIUC, your comment above means that they don't work because the > > CKPT_ARCH_ID is a mismatch. The fix should be trivial - either > > make 'restart' modify it, or make the kernel tolerate it. > > > > [2] The last two cases consider the case when the restart program > > itself has different bit-ness than the checkpointed program (and > > transition may occur in either direction). While lower priority, > > we would like this to work, too. > > Great table. Is it posted in the ckpt wiki too? > > http://ckpt.wiki.kernel.org > > I could take care of that for you if not. Perhaps it belongs under > the "Checklist"? Yep, I'll add it. > > The question is whether the transition 64 -> 32 (or 32 ->64) from > > the 'restart' program to the restarting task should happen in the > > kernel as part of sys_restart(), or in user space using an execve() > > syscall before calling sys_restart(). > > The recent exec bug while switching personalities highlights the value, > in my opinion, of keeping these transitions out of the restart syscall. > There's great potential for nasty, long-term bugs in any code that > deals with those kinds of switches. Keeping that code "in one place" is > the best way to avoid adding similar bugs. I agree with the concern. There is even a stronger argument: doing it in user-space via exec will rid the need to repeat the kernel code for different archs. The only caveat is that restarting mixes 32- and 64-bit programs will be slower because fo the exec calls. Then again, it can be optimized when someone complains (and provides kernel code for this). > > > Doing so in user space is not trivial when threads are involved, > > since the exec must then happen before the creation of threads (or > > it will kill them). This will complicate the implementation of the > > MakeForest() algorithm which relies on all all descendents seeing > > the same data structures. > > True -- MakeForest is already rather complicated. > > As for seeing the same data structures across exec, perhaps we should > keep an fd open across exec and read/map the table from that. That means > converting from struct task* to indices in the table for one thing. I > have some RFC patches for that. It also means the table contents have to > use the same layout between 32 and 64-bit -- also quite easy. > > What I couldn't see was a good place to do the exec itself. Can you send me the patches that you already have ? There are a couple of more details: * Checkpoint needs to also record the bit-ness of each process in the tasks table * On restart, if all tasks have same bit-ness, then one exec at most is needed. * Where to place the exec ? ideally for process (without threads) it would happen right before sys_restart(), but for processes that have threads it should happen right before the fork(). * Passing an fd is a good idea - anonymous shared memory should do the trick. * I'm a bit concerned about security, because 'restart' will likely be a setuid utility: 1) need to ensure that we are exec'ing the right program (e.g. use the /proc/self/exe link) 2) a new switch to 'restart' will say "use fd N for the data", but a user may provide arbitrary data - can they do harm ? we'll definitely need to be more cautious in handling the tasks array because in this case we don't construct it ourselves. Oren. _______________________________________________ Containers mailing list Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx https://lists.linux-foundation.org/mailman/listinfo/containers