Re: [PATCH 9/9][cr][v2]: Restore file-locks

On 05/26/2010 07:57 PM, Sukadev Bhattiprolu wrote:
> steve@xxxxxxxxxxx [steve@xxxxxxxxxxx] wrote:
> | Hi,
> | 
> | On Tue, May 18, 2010 at 08:07:32PM -0700, Sukadev Bhattiprolu wrote:
> | > Restore POSIX file-locks of an application from its checkpoint image.
> | > 
> | > Read the saved file-locks from the checkpoint image and for each POSIX
> | > lock, call flock_set() to set the lock on the file.
> | > 
> | > As pointed out by Matt Helsley, no special handling is necessary for a
> | > process P2 in the checkpointed container that is blocked on a lock L1
> | > held by another process P1, since processes in the restarted container
> | > begin execution only after all processes have been restored. If the
> | > blocked process P2 is restored first, it will prepare to return
> | > -ERESTARTSYS from the fcntl() system call, but wait for P1 to be
> | > restored. When P1 is restored, it will re-acquire the lock L1 before P1
> | > and P2 begin actual execution. This ensures that even if P2 is scheduled
> | > to run before P1, P2 will go back to waiting for the lock L1.
> | >
> | Does that imply certain conditions wrt checkpointed processes and
> | NFS exports? I'm not sure I exactly understand the use case which
> | this is intended to address.
> 
> Well, yes this assumes some pre-requisites are met.
> 
> First, let's look at a single system.  We expect that the application
> process tree is run inside a container. This means that the file
> system(s) (and other resources like pipes, IPC) that the application
> is working with are not modified by a process outside the container.

To be precise, we require that (a) resources won't change during
the checkpoint, and (b) the filesystem view at restart would be
the same as at checkpoint.

Running applications inside an isolated container is one way to
achieve that and, more importantly, to guarantee it. That in turn
gives some assurance about the resulting checkpoint image.

However, the requirements may be satisfied even outside a container
by, for example, a well-behaved application; except that then we
can't say it's safe: it depends on the application.

> We also require that the application process tree be frozen before
> checkpointing the application. So even if the checkpoint process takes
> a few minutes, the state of the resources (files, pipes, signals etc)
> does not change, since (a) the application is containerized and (b) the
> container is frozen.
> 
> We already have the ability to run applications inside containers, using
> the clone() system call (see lxc.sf.net for example) and the ability to
> freeze the application using the freezer cgroup in the Linux kernel.
> 
> | 
> | I was hoping to figure out whether it would also still be safe on
> | a cluster filesystem as well,
> 
> For clusters and NFS, an external protocol has to be established so that
> the distributed application can be started/frozen/checkpointed/restarted
> in a coordinated way.
> 
> I think that is something that would have to be built on top of the
> checkpoint/restart functionality that we are working on. Or maybe there
> are existing implementations that we would need to plug into.
> 
> Hope that helps, but it's possible I missed your question :-). If so
> please let me know.

What you refer to is checkpoint in a cluster, or distributed checkpoint
of multiple cooperating processes/applications that run on multiple
hosts. Indeed, one simple way to do it is coordinated distributed
checkpoint and restart.

However, I think the question was about a single application (or
container) that is accessing remote and clustered (and possibly
distributed) *file systems*.

Ideally, we would like to have a method to snapshot a filesystem
at checkpoint to guarantee that at restart we have a consistent
view of that file system. This is regardless of whether the file
system is local or remote.  In the absence of such a mechanism,
we will have to rely on the file system not being changed (at least
those parts that the checkpointed application expects to remain as
they were) until the restart.
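
As a rough illustration of what "relying on the file system not being
changed" could mean from user space, here is a minimal sketch. It is
not part of the patch set; the function names and the choice of
size/mtime/SHA-256 as the recorded state are assumptions made only for
this example. It records the state of a set of files at checkpoint and
reports which of them differ at restart.

```python
import hashlib
import os


def snapshot_manifest(paths):
    """Record size, mtime and a content hash for each file (hypothetical helper)."""
    manifest = {}
    for path in paths:
        st = os.stat(path)
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        manifest[path] = (st.st_size, st.st_mtime_ns, digest)
    return manifest


def changed_since(manifest):
    """Return the paths whose current state no longer matches the manifest."""
    changed = []
    for path, recorded in manifest.items():
        try:
            current = snapshot_manifest([path])[path]
        except FileNotFoundError:
            # The file was removed after the manifest was taken.
            changed.append(path)
            continue
        if current != recorded:
            changed.append(path)
    return changed
```

A restart tool could call snapshot_manifest() on the files the
application had open at checkpoint, and refuse to restart (or warn) if
changed_since() returns a non-empty list. This only detects changes
after the fact; it does not provide the snapshot guarantee discussed
above.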

Oren.