Serge E. Hallyn wrote:
>
> Here:
>
> From 8cf006a1bf26a4b280841401302c99689d629e0a Mon Sep 17 00:00:00 2001
> From: Serge E. Hallyn <serue@xxxxxxxxxx>
> Date: Thu, 1 Oct 2009 11:09:40 -0400
> Subject: [PATCH 1/1] restart debug: add final process tree status (v2)
>
> Have tasks in sys_restart keep some status in a list off
> of checkpoint_ctx, and print this info when the checkpoint_ctx
> is freed.
>
> This version is mainly just ported against ckpt-v18-hallyn.
>
> Sample output:
>
> [3519:2:c/r:free_per_task_status:207] 3 tasks registered, nr_tasks was 0 nr_total 0
> [3519:2:c/r:free_per_task_status:210] active pid was 1, ctx->errno 0
> [3519:2:c/r:free_per_task_status:212] kflags 6 uflags 0 oflags 1
> [3519:2:c/r:free_per_task_status:214] task 0 to run was 2
> [3519:2:c/r:free_per_task_status:217] pid 3517
> [3519:2:c/r:free_per_task_status:219] it was coordinator
> [3519:2:c/r:free_per_task_status:227] it was running
> [3519:2:c/r:free_per_task_status:217] pid 3519
> [3519:2:c/r:free_per_task_status:223] it was the root task
> [3519:2:c/r:free_per_task_status:229] it was a normal task
> [3519:2:c/r:free_per_task_status:217] pid 3520
> [3519:2:c/r:free_per_task_status:221] it was a ghost
>
> Signed-off-by: Serge E. Hallyn <serue@xxxxxxxxxx>

Looks good.. I'll massage it a bit and add.
Meanwhile, a couple of questions:

[...]
> ---
>  checkpoint/restart.c             |  106 ++++++++++++++++++++++++++++++++++++++
>  checkpoint/sys.c                 |   57 ++++++++++++++++++++
>  include/linux/checkpoint_types.h |   20 +++++++
>  3 files changed, 183 insertions(+), 0 deletions(-)
>
> diff --git a/checkpoint/restart.c b/checkpoint/restart.c
> index b12c8bd..1f356c0 100644
> --- a/checkpoint/restart.c
> +++ b/checkpoint/restart.c
> @@ -26,6 +26,98 @@
>  #include <linux/checkpoint.h>
>  #include <linux/checkpoint_hdr.h>
>
> +#ifdef CONFIG_CHECKPOINT_DEBUG
> +static struct ckpt_task_status *ckpt_debug_checkin(struct ckpt_ctx *ctx)
> +{
> +	struct ckpt_task_status *s;
> +	s = kmalloc(sizeof(*s), GFP_KERNEL);
> +	if (!s)
> +		return NULL;
> +	s->pid = current->pid;
> +	s->error = 0;
> +	s->flags = RESTART_DBG_WAITING;
> +	if (current == ctx->root_task)
> +		s->flags |= RESTART_DBG_ROOT;
> +	list_add_tail(&s->list, &ctx->per_task_status);
> +	return s;
> +}

The logic would be a bit simpler if you allow check-in to fail (and then
fail the restart) - you then don't need to test for validity of @s
everywhere.

> +
> +static struct ckpt_task_status *getme(struct ckpt_ctx *ctx)
> +{
> +	struct ckpt_task_status *s = NULL;
> +	list_for_each_entry(s, &ctx->per_task_status, list) {
> +		if (s->pid == current->pid)
> +			break;
> +	}
> +	if (!s || s->pid != current->pid)
> +		return NULL;

Note that here @s is never NULL.

[...]

> @@ -680,11 +772,17 @@ static int do_ghost_task(void)
>  	if (IS_ERR(ctx))
>  		return PTR_ERR(ctx);
>
> +	ckpt_debug_ghost(ctx);
> +
> +	ckpt_debug_log_running(ctx);
> +
>  	current->flags |= PF_RESTARTING;
>
>  	ret = wait_event_interruptible(ctx->ghostq,
>  				       all_tasks_activated(ctx) ||
>  				       ckpt_test_ctx_error(ctx));
> +
> +	ckpt_debug_log_error(ctx, 0);

Did you mean s/0/ret/ ?

[...]
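
To expand on the getme() remark: list_for_each_entry() never leaves its
cursor NULL. When the loop runs to the end without a break, the cursor
points at the bogus entry computed from the list head itself, so the !s
test cannot fire and only the pid comparison does any work. One way to
avoid relying on the cursor after the loop is to return from inside it.
Below is a minimal userspace sketch of that pattern. The list helpers are
a simplified stand-in for <linux/list.h>, and struct task_status and
find_status() are hypothetical names used for illustration only; they are
not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's circular doubly linked list. */
struct list_head {
	struct list_head *next, *prev;
};

void list_init(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical per-task record, loosely modelled on the one in the patch. */
struct task_status {
	int pid;
	struct list_head list;
};

/*
 * Return the entry for @pid, or NULL if there is none.
 * The match is returned from inside the loop, so the cursor is never
 * inspected after a full traversal (where it would not be NULL).
 */
struct task_status *find_status(struct list_head *head, int pid)
{
	struct list_head *pos;

	assert(head != NULL);
	for (pos = head->next; pos != head; pos = pos->next) {
		struct task_status *s =
			container_of(pos, struct task_status, list);
		if (s->pid == pid)
			return s;
	}
	return NULL;
}
```

With this shape the caller only ever sees a real entry or NULL, so a
single NULL check at the call site is enough.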
> +	list_for_each_entry_safe(s, p, &ctx->per_task_status, list) {
> +		ckpt_debug("pid %d\n", s->pid);
> +		if (s->flags & RESTART_DBG_COORD)
> +			ckpt_debug("it was coordinator\n");
> +		if (s->flags & RESTART_DBG_GHOST)
> +			ckpt_debug("it was a ghost\n");
> +		if (s->flags & RESTART_DBG_ROOT)
> +			ckpt_debug("it was the root task\n");
> +		if (s->flags & RESTART_DBG_WAITING)
> +			ckpt_debug("it was still waiting to run restart\n");
> +		if (s->flags & RESTART_DBG_RUNNING)
> +			ckpt_debug("it was running\n");
> +		if (s->flags & RESTART_DBG_NORMAL)
> +			ckpt_debug("it was a normal task\n");
> +		if (s->flags & RESTART_DBG_FAILED)
> +			ckpt_debug("it finished with error %d\n", s->error);
> +		if (s->flags & RESTART_DBG_FAILED)

s/FAILED/SUCCESS/ ... :p

[...]

Oren.

_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers