Hi,

On 2010/03/18, at 5:55, Serge E. Hallyn wrote:

> Quoting Jiro SEKIBA (jir@xxxxxxxxxxxxxxxxx):
>> Hi,
>>
>> Thank you for prompt reply!
>> Sorry that I didn't post to containers@xxxxxxxxxxxxxxxxxxxxxxxxxxx
>>
>> On 2010/03/16, at 7:55, Oren Laadan wrote:
>>
>>> Hi,
>>>
>>> Thanks for taking the time to evaluate c/r. You may want to also
>>> try the latest, which is (as of now) ckpt-v20-rc2.
>>
>> Yeah, I'll eventually try to keep up with the latest,
>> but I just want to try the one you think it's stable first anyway.
>>
>>> In the future, please CC the containers mailing list for issues
>>> related to c/r, at "containers@xxxxxxxxxxxxxxxxxxxxxxxxxx".
>>>
>>> Jiro SEKIBA wrote:
>>>> Hi,
>>>> I'm trying to evaluate external checkpoint/restart with cr-v19 kernel.
>>>> However, when I restart, I got "Killed" message in stdout.
>>>> Do you have any tips or clue that are not in
>>>> Documentation/checkpoint/usage.txt ?
>>>> I'm using kernel pulled from
>>>> git://git.ncl.cs.columbia.edu/pub/git/linux-cr.git .
>>>> checkout tag named "ckpt-v19". Base distro is ubuntu 9.10.
>>>> I ran self checkpoint/restart sample program in Documentation/checkpoint.
>>>> It works as written in usage.txt.
>>>> However, I can not make external checkpoint/restart work properly.
>>>> I made a simple test program below and create checkpoint externally using
>>>> the program in Documentation/checkpoint/, it looks checkpoint file is
>>>> created properly.
>>>> However, when I ran self_restart < ckpt.image, I got "Killed" message.
>>>
>>> If you take an external checkpoint, then you need to match it
>>> with an external restart, as opposed to self_restart.
>>>
>>> Otherwise, restarting with self_restart from a checkpoint that is
>>> not a self-checkpoint can yield unexpected results.
>>>
>>> Since you don't mention in your post, I don't know if you are using
>>> the tools from user-cr. If not, then you should use 'checkpoint' and
>>> 'restart' tools from there.
>>> It is available from:
>>> git://git.ncl.cs.columbia.edu/pub/git/user-cr.git
>>> (use the same branch as the one you used to linux-cr).
>>>
>>> Once you have the tools compiled, and you checkpoint with the
>>> 'checkpoint' utility from there, you can restart with:
>>> restart -v < ckpt.image
>>>
>>
>> Thank you for the information.
>> Actually I was trying to create checkpoint in Documentation/checkpoint.
>>
>> Now, I tried with user-cr, compiled binary in the same tag (ckpt-v19).
>> Creating checkpoint looks OK and restart -v shows it Success. nice!
>> However, the contents in /tmp/test.out never get further,
>> it remains same as when created checkpoint.
>>
>> I tried "./restart -F /cgroup/0 -v --no-pidns < ckpt.image", got Success.
>> cat /cgroup/0/tasks tells that there is a process.
>> ps shows ./test. So, it looks restarting.
>>
>> # ps axuww |grep $(cat /cgroup/0/tasks )
>> root 7231 0.1 0.0 1588 64 pts/0 D 16:57 0:00 ./test
>> root 7238 0.0 0.1 2716 660 pts/1 R+ 16:57 0:00 grep 7231
>>
>> under the /proc, one file descriptor opened, and it is /tmp/test.out
>>
>> # ls -l /proc/$(cat /cgroup/0/tasks)/fd
>> total 0
>> lrwx------ 1 root root 64 Mar 16 16:58 0 -> /tmp/test.out
>>
>> Nhh, it's close..
>>
>> I found that when I mount cgroup with -o freezer, self_checkpoint won't work.
>> It worked even I didn't mount the cgroup.
>> Is it what you expect?
>
> No, it is not. Can you tell us more about exactly how it fails?
>

OK, I've checked the differences in dmesg between a self_restart run that
works and one that doesn't.

When it goes well, the filename is /tmp/cr-self.out:

[ 401.522556] [2307:2307:c/r:ckpt_read_fname:571] read filename '/tmp/cr-self.out'
[ 401.522558] [2307:2307:c/r:restore_open_fname:594] fname '/tmp/cr-self.out' flags 0x2

However, when the contents of the file stay unchanged, the filename is
/tmp/cr-self.out.orig, which is, of course, the original file that was
bound to the original process:

[ 1088.414250] [2951:2951:c/r:ckpt_read_fname:571] read filename '/tmp/cr-self.out.orig'
[ 1088.414253] [2951:2951:c/r:restore_open_fname:594] fname '/tmp/cr-self.out.orig' flags 0x2

I cannot reproduce it yet, but at least the cgroup freezer option does not
affect it the way I said. Sorry if that confused you.

I still cannot restart from an external checkpoint.
I'll try v20 next time.

> Maybe get the cr_tests (either from Oren's tree or from
> git clone git://git.sr71.net/~hallyn/cr_tests.git), cd cr_test,
> make, cd simple, run ./ckpt and send us the contents of
> /tmp/log, dmesg, and ckptinfo -ve /tmp/out ?

I think it runs OK, but I'm sending the output just in case.
/tmp/log was empty, by the way.

thanks

>> Thank you again for the help!
>> I'm feeling better to use the latest ..
>
> -serge
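
For comparing two dmesg captures like the ones quoted above, a small script along these lines might help. It is only a sketch: the line format is assumed from the four sample lines in this mail, the function name restored_filenames is made up for illustration, and I am guessing that the first number in the bracket is the pid.

```python
import re

# Format assumed from the sample lines in this mail only, e.g.
# [ 401.522556] [2307:2307:c/r:ckpt_read_fname:571] read filename '/tmp/cr-self.out'
CR_LINE = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+"
    r"\[(?P<id1>\d+):(?P<id2>\d+):c/r:(?P<func>\w+):(?P<line>\d+)\]\s+"
    r"(?P<msg>.*)"
)
QUOTED = re.compile(r"'([^']*)'")


def restored_filenames(dmesg_text):
    """Return (id1, filename) pairs from ckpt_read_fname lines, in log order.

    id1 is assumed (not confirmed) to be the pid of the restarting task.
    """
    found = []
    for raw in dmesg_text.splitlines():
        m = CR_LINE.search(raw)
        if m is None or m.group("func") != "ckpt_read_fname":
            continue  # not a c/r line, or a different c/r function
        name = QUOTED.search(m.group("msg"))
        if name is not None:
            found.append((int(m.group("id1")), name.group(1)))
    return found


if __name__ == "__main__":
    # Two of the lines from this mail, used as sample input.
    sample = (
        "[ 401.522556] [2307:2307:c/r:ckpt_read_fname:571] "
        "read filename '/tmp/cr-self.out'\n"
        "[ 1088.414250] [2951:2951:c/r:ckpt_read_fname:571] "
        "read filename '/tmp/cr-self.out.orig'\n"
    )
    for task_id, fname in restored_filenames(sample):
        print(task_id, fname)
```

Feeding it the dmesg text from a working run and from a failing run should make it easy to see which file each restart actually opened.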
Attachment: ckptinfo-ve.log (binary data)
Attachment: dmesg (binary data)
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers