Quoting Raghu D K (dk.raghu@xxxxxxxxx):
> Hello All,
>
> I moved the "#!/bin/sh" to point to "bash" however I still see issues
> in used the "git://www.linux-cr.org/pub/git/tests-cr" scripts.
> Probably I am missing something with my wrong understanding, I am a
> little confused with the usage of user space application "checkpoint"
> and "restart" and the applications in the "test-cr" folder.
>
> I wrote a sample shell script "my-test.sh" and tried the following
> without much success.
>
> #!/bin/sh
> #
> #
> #***********************************************************************************
>
> echo "Incrementing variable ..."
> COUNT=$1
> X=0
> while [ $X -le $COUNT ];
> do
>     X=$(( $X + 1 ))
>     echo "Value of X =" $X
>     sleep 1
> done
>
>
> $ cd ~/user-cr
> $ mount -tcgroup -o freezer cgroup /cgroup
> $ mkdir -p /cgroup/1
> $ nsexec -z5000 my-test.sh 100 &
> $ echo 5000 > /cgroup/1/tasks
> $ echo FROZEN > /cgroup/1/freezer.state
>
> $ checkpoint 5000 > ckpt.image
>
> This generated a "ckpt.image" file of size 2594550 bytes
>
> $ ckptinfo -epv ckpt.image
> info: [@8] object 1 HDR_HEADER len 72
> info: [@80] object 4 HDR_BUFFER len 73
> info: [@153] object 4 HDR_BUFFER len 73
> info: [@226] object 4 HDR_BUFFER len 73
> ...
> unexpected end of file (read 0 of 8)
>
> $ kill -9 5000
> $ echo THAWED > /cgroup/1/freezer.state
> $ ./restart < ckpt.image
>
> This one shows error "Bad file discriptor", what I am missing ?

First, you can find more information about what went wrong in a few
ways:

1. add '-l logfile' arguments to checkpoint and restart commands, to
   put more debug messages into 'logfile' (which must not yet exist)
2. add '-v' argument to checkpoint and restart for debugging
3. look at /var/log/syslog for lots of error messages, assuming you
   have CONFIG_CHECKPOINT_DEBUG (or whatever that is called) set in
   your kernel
4. after doing checkpoint, use 'ckptinfo', which came with the user-cr
   programs, to analyze the checkpoint image

I suspect what happened to you, though, is that you left file
descriptors open. If you look at counterloop/crcounter.c in the tests,
it does 'for i in (1..100) close(i)'. The problem with not doing this
is that the program you are checkpointing has inherited file
descriptors from its parent task, and, at restart, it has no way to
recreate those.

-serge
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers
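
As a rough illustration of the descriptor cleanup described above, here
is a minimal sketch of how a script like my-test.sh might drop its
inherited descriptors before it is frozen. The helper name detach_fds
and the log path are hypothetical. It only touches descriptors 0
through 9, since portable sh does not guarantee redirection syntax for
larger numbers, and it assumes restart can reopen /dev/null and a
regular log file by path, which is not verified here.

```shell
# Hypothetical helper, not part of user-cr or tests-cr.
# Replaces stdin/stdout/stderr with descriptors that do not depend on
# the parent's terminal, then closes descriptors 3..9.
detach_fds() {
    log=$1

    # stdin from /dev/null; stdout and stderr appended to the log file.
    # Assumption: these can be recreated by path at restart time.
    exec 0</dev/null 1>>"$log" 2>&1

    # Close whatever else was inherited from the parent task.
    # Closing a descriptor that is not open is harmless.
    for fd in 3 4 5 6 7 8 9; do
        eval "exec $fd>&-"
    done
}
```

Calling `detach_fds /tmp/my-test.log` near the top of the script,
before the loop, would send the counter output to that file instead of
the terminal.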