Oren Laadan wrote:
>
> Matthieu Fertré wrote:
>> Hi,
>>
>> Louis Rilling wrote:
>>> On 29/04/09 18:47 -0400, Oren Laadan wrote:
>>>> Hi Louis,
>>>>
>>>> Louis Rilling wrote:
>>>>> Hi,
>>>>>
>>>>> On 28/04/09 19:23 -0400, Oren Laadan wrote:
>>>>>> Here is the latest and greatest of checkpoint/restart (c/r) patchset.
>>>>>> The logic and image format reworked and simplified, code refactored,
>>>>>> support for PPC, s390, sysvipc, shared memory of all sorts, namespaces
>>>>>> (uts and ipc).
>>>>> I should have asked before, but what are the reasons to checkpoint SYSV IPCs
>>>>> in the same file/stream as tasks? Would it be better to checkpoint them
>>>>> independently, like the file system state?
>>>>>
>>>>> In Kerrighed we chose to checkpoint SYSV IPCs independently, a bit like the file
>>>>> system state, because SYSV IPCs objects' lifetime do not depend on tasks
>>>>> lifetime, and we can gain more flexibility this way. In particular we envision
>>>>> cases in which two applications share a state in a SYSV SHM (something like a
>>>>> producer-consumer scheme), but do not need to be checkpointed together. In such
>>>>> a case the SYSV SHM itself could even need more high-availability (using
>>>>> active replication) than a checkpoint/restart facility.
>>>>>
>>>> Thanks for the feedback, this is actually an interesting idea.
>>>>
>>>> Indeed in the past I also considered SYSV IPC to be a "global" resource
>>>> that was checkpointed before iterating through the tasks.
>>>>
>>>> However, in the presence of namespaces, the lifetime of an IPC namespace
>>>> does depend on tasks lifetime - when the last task referring to a
>>>> given namespace exits - that namespace is destroyed. Of course, the
>>>> root namespace is truly global, because init(1) never exits.
>>>>
>>>> What would 'checkpoint them independently' mean in this case ?
>>> I mean that the producer and the consumer could have separate checkpointing
>>> policies (if any), and the IPC SHM as well.
>>>
>>>> In your use-case, can you restart either application without first
>>>> restoring the relevant SYSVIPC ?
>>> Probably not.
>>>
>> Well, it depends. It makes no sense to restart the application without
>> restoring the relevant SHM, but it may for a message queue (this is
>> application specific of course). Message queue is not linked to the
>> process, it can disappear during the life of the application.
>
> Agreed - the concern regards mainly the SHM case.
>
>>>> Can you think of other use-cases for such a division ? Am I right to
>>>> guess that your use case is specific to the distributed (and SSI-)
>>>> nature of your system ? (Active-replication of SYSV_SHM sounds
>>>> awfully related to DSM :)
>>> The case of active-replication may be specific to DSM-based systems, but the
>>> case of independent policies is already interesting in standalone boxes.
>>>
>>>> While not focusing on such use cases, I want to keep the design flexible
>>>> enough to not exclude them a-priori, and be able to address them later
>>>> on. Indeed, the code is split such that the function to save a given
>>>> IPC namespace does not depend on the task that uses it. Future code
>>>> could easily use the same functionality.
>>>>
>>>> One way to be flexible to support your use case, is by having some
>>>> mechanism in place to select whether a resource (virtually any) is
>>>> to be checkpointed/restored.
>>>>
>>>> For example, you could imagine checkpoint(..., CHECKPOINT_SYSVIPC)
>>>> to checkpoint (also) IPC, and not checkpoint IPC in its absence.
>>>>
>>>> So normally you'd have checkpoint(..., CHECKPOINT_ALL). When you don't
>>>> want IPC, you'd use CHECKPOINT_ALL & ~CHECKPOINT_SYSVIPC. When you
>>>> want only IPC, you'd use CHECKPOINT_SYSVIPC only.
>>>>
>>>> Same thing for restart, only that it will get trickier in the "only IPC"
>>>> case, since you will need to tell which IPC namespace is affected.
>>>>
>>>> Also, I envision a task saying cradvise(CHECKPOINT_SYSVIPC, false),
>>>> telling the kernel to not c/r its IPC namespace. (Or any other
>>>> resource). Again there would need to be a way to add a restored
>>>> namespace.
>>>>
>>>> Does this address your concerns ?
>>> Yes this sounds flexible enough. Thanks for taking this into account.
>> I see one drawback with this approach if you allow checkpoint of
>> application that is not isolated in a container. In that case, you may
>> want to select which IPC objects to dump to not dump all the IPC objects
>> living in the system. Indeed, this is why we have chosen in Kerrighed to
>> checkpoint IPC objects independently of tasks, since we have no
>> container/namespaces support currently.
>
> I assume that in this case it will be the application itself that
> will somehow tell the system which specific sysvipc objects (ids) it
> cares about.

Sure, the system can not know it.

>
> (I'm not sure how would the system otherwise know what to dump and
> what to leave out).
>
> I originally proposed the construct of cradvise() syscall to handle
> exactly those cases where the application would like to advise the
> kernel about certain resources. So, extending the previous example,
> a task may call something like:
>
>	cradvise(CHECKPOINT_SYSVIPC_SHM, false);  /* generally skip shm */
>	cradvise(CHECKPOINT_SYSVIPC_SHMID, id, true);  /* but include this */
>
> or:
>	cradvise(CHECKPOINT_SYSVIPC_SHM, true);  /* generally include shm */
>	cradvise(CHECKPOINT_SYSVIPC_SHMID, id, false);  /* but skip this */
>
> Anyway, these are just examples of the concept and what sort of generic
> interface can be used to implement it; don't pick on the details...

Ok, seems good :)

Thanks,

Matthieu
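
For readers following the thread, the selection logic discussed above (a global CHECKPOINT_* mask, a per-task default for SHM, and per-id exceptions) can be sketched in plain user-space C. This is only an illustration of the concept, not the proposed kernel interface: the flag values, CHECKPOINT_TASKS, struct cr_policy and all three helper functions are hypothetical names invented for this sketch, and no cradvise() syscall is assumed to exist.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bit values: the thread names the flags but never defines them. */
#define CHECKPOINT_TASKS   (1u << 0)   /* assumed placeholder for "everything else" */
#define CHECKPOINT_SYSVIPC (1u << 1)
#define CHECKPOINT_ALL     (CHECKPOINT_TASKS | CHECKPOINT_SYSVIPC)

#define MAX_OVERRIDES 16

/* One per-id exception to the default SHM policy. */
struct shm_override {
    int  id;
    bool include;
};

/* Hypothetical per-task advice state, standing in for what cradvise() might record. */
struct cr_policy {
    bool                shm_default;              /* include SHM segments by default? */
    struct shm_override overrides[MAX_OVERRIDES];
    size_t              n_overrides;
};

/* Models cradvise(CHECKPOINT_SYSVIPC_SHM, include). */
static void advise_shm_default(struct cr_policy *p, bool include)
{
    assert(p != NULL);
    p->shm_default = include;
}

/* Models cradvise(CHECKPOINT_SYSVIPC_SHMID, id, include). Returns -1 if the table is full. */
static int advise_shm_id(struct cr_policy *p, int id, bool include)
{
    assert(p != NULL);
    for (size_t i = 0; i < p->n_overrides; i++) {
        if (p->overrides[i].id == id) {
            p->overrides[i].include = include;   /* update an existing exception */
            return 0;
        }
    }
    if (p->n_overrides == MAX_OVERRIDES)
        return -1;
    p->overrides[p->n_overrides].id = id;
    p->overrides[p->n_overrides].include = include;
    p->n_overrides++;
    return 0;
}

/* Decide whether one SHM segment is saved: the global mask gates everything,
 * then a per-id exception wins over the default. */
static bool should_checkpoint_shm(const struct cr_policy *p, unsigned flags, int id)
{
    assert(p != NULL);
    if (!(flags & CHECKPOINT_SYSVIPC))
        return false;
    for (size_t i = 0; i < p->n_overrides; i++) {
        if (p->overrides[i].id == id)
            return p->overrides[i].include;
    }
    return p->shm_default;
}
```

With a zero-initialized policy, calling advise_shm_default(&p, false) and then advise_shm_id(&p, id, true) reproduces the first cradvise() example from the thread: only that segment is kept, and passing CHECKPOINT_ALL & ~CHECKPOINT_SYSVIPC as the mask skips it as well.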
_______________________________________________
Containers mailing list
Containers@xxxxxxxxxxxxxxxxxxxxxxxxxx
https://lists.linux-foundation.org/mailman/listinfo/containers