Re: [PATCH] [RFC] c/r: Add UTS support

Cedric Le Goater <legoater@xxxxxxx> · Wed, 18 Mar 2009 10:01:00 +0100

Oren Laadan wrote:
> 
> Dan Smith wrote:
>> SH> Well it forces restart to go through the established userspace
>> SH> API's when creating resources (in this case, tasks and namespaces)
>> SH> which means any existing security guarantees are leveraged.
>>
>> That's a very valid point.  However, it still seems unbalanced to make
>> checkpoint a completely in-kernel process and restart an odd mix of
>> the two with potentially more confusing semantics and requirements.
>>
> 
> There are other reasons to allow restart to be not fully symmetric
> with respect to checkpoint. For example, if you have a smart(er) user
> space application that wants to provide the restart some of the resources
> pre-constructed, allowing much flexibility (already requested by people)
> for the restart procedure (e.g., when doing distributed checkpoint, or
> when restarting a special device whose).

Yes.

The arguments you have for restart are also valid for checkpoint in a
distributed checkpoint scenario.

You want to be able to abort the checkpoint of a job easily and rapidly
when one node (among thousands) fails for some reason. A batch manager
would use a signal for that.
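
To make the signal-based abort concrete, here is a rough user-space sketch,
not code from any c/r patch. It assumes a POSIX system, and every name in it
(checkpoint_job, CheckpointAborted, the stage names, the choice of SIGUSR1)
is hypothetical. The checkpoint is split into stages, and an abort request
is honoured at the next stage boundary.

```python
import os
import signal


class CheckpointAborted(Exception):
    """Raised when an abort request arrives while a checkpoint is running."""


_abort_requested = False


def _request_abort(signum, frame):
    # Keep the handler trivial: only record that an abort was requested.
    global _abort_requested
    _abort_requested = True


def checkpoint_job(stages, abort_signal=signal.SIGUSR1):
    """Run (name, callable) stages in order.

    The abort flag is tested before each stage, so a signal received
    during one stage stops the checkpoint before the next one starts.
    SIGUSR1 is an arbitrary choice made for this sketch.
    """
    global _abort_requested
    _abort_requested = False
    previous = signal.signal(abort_signal, _request_abort)
    completed = []
    try:
        for name, stage in stages:
            if _abort_requested:
                raise CheckpointAborted(f"aborted before stage {name!r}")
            stage()
            completed.append(name)
        return completed
    finally:
        # Restore whatever handler was installed before the checkpoint.
        if previous is not None:
            signal.signal(abort_signal, previous)
```

A batch manager would then do the equivalent of os.kill(pid, signal.SIGUSR1)
on each node's checkpoint process when one node reports a failure. Doing the
same for a checkpoint that runs fully in-kernel would need some other abort
path, which is the point above.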

You also want fine-grained network synchronization when migrating only
one node.

We've had to solve the above issues on a large HPC project and there are 
plenty of other good reasons to have a mix of kernel and user space for 
restart and for checkpoint. 

C.
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers