Re: [PATCH 4/4] cgroup freezer: --- replacement patch 4/4 (a)

Oren Laadan <orenl@xxxxxxxxxxxxxxx> · Thu, 4 Jun 2009 09:54:38 -0400 (EDT)

On Thu, 4 Jun 2009, Oren Laadan wrote:

> On Thu, 4 Jun 2009, Matt Helsley wrote:
> 
> > On Wed, Jun 03, 2009 at 08:10:24PM -0400, Oren Laadan wrote:
> > > 
> > > From 6cdb7d9504a19ad88e6da8ad85374267c7acb1b2 Mon Sep 17 00:00:00 2001
> > > From: Matt Helsley <matthltc@xxxxxxxxxx>
> > > Date: Wed, 3 Jun 2009 02:31:21 -0700
> > > Subject: [PATCH] cgroup freezer: Add CHECKPOINTING state to safeguard container checkpoint
> > > 
> > > The CHECKPOINTING state prevents userspace from unfreezing tasks until
> > > sys_checkpoint() is finished. When doing container checkpoint userspace
> > > will do:
> > > 
> > > 	echo FROZEN > /cgroups/my_container/freezer.state
> > > 	...
> > > 	rc = sys_checkpoint( <pid of container root> );
> > > 
> > > To ensure a consistent checkpoint image userspace should not be allowed
> > > to thaw the cgroup (echo THAWED > /cgroups/my_container/freezer.state)
> > > during checkpoint.
> > > 
> > > "CHECKPOINTING" can only be set on a "FROZEN" cgroup using the checkpoint
> > > system call. Once in the "CHECKPOINTING" state, the cgroup may not leave until
> > > the checkpoint system call is finished and ready to return. Then the
> > > freezer state returns to "FROZEN". Writing any new state to freezer.state while
> > > checkpointing will return EBUSY. These semantics ensure that userspace cannot
> > > unfreeze the cgroup midway through the checkpoint system call.
> > > 
> > > The cgroup_freezer_begin_checkpoint() and cgroup_freezer_end_checkpoint()
> > > make relatively few assumptions about the task that is passed in. However the
> > > way they are called in do_checkpoint() assumes that the root of the container
> > > is in the same freezer cgroup as all the other tasks that will be
> > > checkpointed.
> > > 
> > > Matt Helsley's wrote the original patch.
> > > 
> > > Changlog:
> > >   [2009-Jun-03] change cgroup_freezer_{begin,end}_checkpoint() to take a
> > >   		struct cgroup_subsys_state pointer
> > > 		add struct cgroup_subsys_state *get_task_cgroup_freezer()
> > > 
> > > Signed-off-by: Oren Laadan <orenl@xxxxxxxxxxxxxxx>
> > > Cc: Matt Helsley <matthltc@xxxxxxxxxx>
> > > Cc: Paul Menage <menage@xxxxxxxxxx>
> > > Cc: Li Zefan <lizf@xxxxxxxxxxxxxx>
> > > Cc: Cedric Le Goater <legoater@xxxxxxx>
> > > 
> > > Notes:
> > >         As a side-effect this prevents the multiple tasks from entering the
> > >                 CHECKPOINTING state simultaneously. All but one will get -EBUSY.
> > > ---
> > >  Documentation/cgroups/freezer-subsystem.txt |   10 ++
> > >  include/linux/freezer.h                     |   10 ++
> > >  kernel/cgroup_freezer.c                     |  149 +++++++++++++++++++--------
> > >  3 files changed, 127 insertions(+), 42 deletions(-)
> > > 
> > > diff --git a/Documentation/cgroups/freezer-subsystem.txt b/Documentation/cgroups/freezer-subsystem.txt
> > > index 41f37fe..92b68e6 100644
> > > --- a/Documentation/cgroups/freezer-subsystem.txt
> > > +++ b/Documentation/cgroups/freezer-subsystem.txt
> > > @@ -100,3 +100,13 @@ things happens:
> > >  		and returns EINVAL)
> > >  	3) The tasks that blocked the cgroup from entering the "FROZEN"
> > >  		state disappear from the cgroup's set of tasks.
> > > +
> > > +When the cgroup freezer is used to guard container checkpoint operations the
> > > +freezer.state may be "CHECKPOINTING". "CHECKPOINTING" can only be set on a
> > > +"FROZEN" cgroup using the checkpoint system call. Once in the "CHECKPOINTING"
> > > +state, the cgroup may not leave until the checkpoint system call returns the
> > > +freezer state to "FROZEN". Writing any new state to freezer.state while
> > > +checkpointing will return EBUSY. These semantics ensure that userspace cannot
> > > +unfreeze the cgroup midway through the checkpoint system call. Note that,
> > > +unlike "FROZEN" and "FREEZING", there is no corresponding "CHECKPOINTED"
> > > +state.
> > > diff --git a/include/linux/freezer.h b/include/linux/freezer.h
> > > index da7e52b..1402911 100644
> > > --- a/include/linux/freezer.h
> > > +++ b/include/linux/freezer.h
> > > @@ -64,12 +64,22 @@ extern bool freeze_task(struct task_struct *p, bool sig_only);
> > >  extern void cancel_freezing(struct task_struct *p);
> > > 
> > >  #ifdef CONFIG_CGROUP_FREEZER
> > > +struct cgroup_subsys_state;
> > >  extern int cgroup_freezing_or_frozen(struct task_struct *task);
> > > +extern struct cgroup_subsys_state *get_task_cgroup_freezer(struct task_struct *task);
> > > +extern int cgroup_freezer_begin_checkpoint(struct cgroup_subsys_state *css);
> > > +extern void cgroup_freezer_end_checkpoint(struct cgroup_subsys_state *css);
> > >  #else /* !CONFIG_CGROUP_FREEZER */
> > >  static inline int cgroup_freezing_or_frozen(struct task_struct *task)
> > >  {
> > >  	return 0;
> > >  }
> > > +static inline int cgroup_freezer_begin_checkpoint(struct task_struct *task)
> > > +{
> > > +	return -ENOTSUPP;
> > > +}
> > > +static inline void cgroup_freezer_end_checkpoint(struct task_struct *task)
> > > +{}
> > 
> > With CONFIG_CHECKPOINT depending on CONFIG_FREEZER I don't see why these
> > empty definitions are needed. Maybe it's just too late in the evening...
> 
> zzzzz....
> 
> > 
> > >  #endif /* !CONFIG_CGROUP_FREEZER */
> > > 
> > >  /*
> > > diff --git a/kernel/cgroup_freezer.c b/kernel/cgroup_freezer.c
> > > index 05795b7..4790fb9 100644
> > > --- a/kernel/cgroup_freezer.c
> > > +++ b/kernel/cgroup_freezer.c
> > > @@ -25,6 +25,7 @@ enum freezer_state {
> > >  	CGROUP_THAWED = 0,
> > >  	CGROUP_FREEZING,
> > >  	CGROUP_FROZEN,
> > > +	CGROUP_CHECKPOINTING,
> > >  };
> > > 
> > >  struct freezer {
> > > @@ -47,6 +48,18 @@ static inline struct freezer *task_freezer(struct task_struct *task)
> > >  			    struct freezer, css);
> > >  }
> > > 
> > > +struct cgroup_subsys_state *get_task_cgroup_freezer(struct task_struct *task)
> > > +{
> > > +	struct cgroup_subsys_state *css;
> > > +
> > > +	task_lock(task);
> > > +	css = task_subsys_state(task, freezer_subsys_id);
> > > +	css_get(css); /* make sure freezer doesn't go away */
> > > +	task_unlock(task);
> > > +
> > > +	return css;
> > > +}
> > 
> > Seems like there should be a better way to do this than grabbing
> > this reference. I'd prefer to just introduce:
> > 
> > int in_same_cgroup_freezer(struct task_struct *task1, 
> > 				struct task_struct *task2);
> > 
> > In the external checkpoint case the root task is frozen and hence cannot
> > change cgroups.
> > 
> > Regardless of that reference, I think the interaction between
> > sys_checkpoint() and the cgroup freezer as we have it right now is broken
> > in the self-checkpoint case. It doesn't work because the freezer allows a
> > task to freeze itself. So if it does simply:
> 
> There should be no interaction. 
> 
> While I do make sure to skip the freezer cgroup test for the self
> checkpoint test, I forgot to do it when calling begin/end checkpoint.
> 
> Since we don't require that a self checkpointing task freeze itself,
> this can be fixed by adding suitable 'if (t != current) before these
> two calls.

Ahhh ... scratch that. Call should remain, of course.

Instead, user can remove the checkpointer task (in self-checkpoint) 
from the freezer cgroup. The tests is may_checkpoint() already allow
current task to not be in the cgroup.

Oren.

_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers