Re: [PATCH 00/16][cr][v3]: C/R file owner, locks, leases

Oren Laadan <orenl@xxxxxxxxxxxxxxx> · Wed, 04 Aug 2010 14:03:50 -0400

On 08/04/2010 01:26 PM, Matt Helsley wrote:
On Wed, Aug 04, 2010 at 11:45:20AM +0100, Steven Whitehouse wrote:
Hi,

On Tue, 2010-08-03 at 16:11 -0700, Sukadev Bhattiprolu wrote:
Checkpoint/restart file owner, file-locks and file-lease information.

Can you explain roughly how this is intended to work, or point me at a
document explaining it?

I'm trying to figure out how the file lock checkpoint will work with
cluster filesystems, or if there needs to be a mechanism to turn this
feature off for those filesystems. What prevents the lock state changing
in an incompatible way between the checkpoint and the restore?

Hi Steve,

In addition to Matt's reply -

Checkpoint/restart _assumes_ that there exists a mechanism to keep
the filesystem state _unchanged_ between checkpoint and restart.

For example, one can kill the application after checkpoint and keep
the filesystem from being touched.
A more likely scenario is to use a filesystem's snapshot/backup
solution during checkpoint to ensure a pristine copy for restart.
In particular, a cluster filesystem needs a mechanism to accomplish
this, or must rely on dedicated userspace tools.

So at restart, the filesystem is assumed to be visible and in the
same state as before. That state also includes locks etc.

Also, c/r has a mechanism to detect cases where a file in use by
the checkpointed application(s) is shared with a task that is not
being checkpointed. In this case, checkpoint will fail, to prevent
inconsistencies.
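
As a rough illustration of that check, here is a minimal Python model. It
assumes the detection amounts to comparing the set of tasks holding each file
against the set of tasks being checkpointed; the names `OpenFile`,
`find_shared_outside`, `checkpoint`, and the task ids are hypothetical and are
not taken from the patches.

```python
from dataclasses import dataclass, field


@dataclass
class OpenFile:
    """Toy stand-in for an open file; 'holders' lists every task id using it."""
    name: str
    holders: set = field(default_factory=set)


def find_shared_outside(files, checkpointed_tasks):
    """Return names of files that are also held by a task outside the set."""
    leaked = []
    for f in files:
        inside = len(f.holders & checkpointed_tasks)
        # A file is safe only if every holder is a checkpointed task.
        if inside > 0 and inside < len(f.holders):
            leaked.append(f.name)
    return leaked


def checkpoint(files, checkpointed_tasks):
    """Refuse to checkpoint when any file in use is shared with an outsider."""
    leaked = find_shared_outside(files, checkpointed_tasks)
    if leaked:
        raise RuntimeError("checkpoint failed, shared files: " + ", ".join(leaked))
    # Only files actually used by the checkpointed tasks are saved.
    return [f.name for f in files if f.holders & checkpointed_tasks]
```

In this model, a file held by tasks 2 and 3 makes the checkpoint of tasks
{1, 2} fail, while checkpointing {1, 2, 3} succeeds.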

(I also imagine that often a cluster filesystem is used by parallel
applications - which in turn require some support to be checkpointed
in a consistent manner).

Oren.

Hi Steve,

[ I'm just going to address your cluster filesystem question and let
   Suka answer your questions on these patches. ]

	Open files whose file operations structs are missing the
.checkpoint operation cause checkpoint to fail. We haven't added a
.checkpoint operation to cluster filesystems because of the kinds of
issues you're referring to.

	I don't think there are any file locks/leases which do not
require opening the file(s) in question. That means file locks
and leases in cluster filesystems should also cause checkpoint
to fail.
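
The rule above can be modelled in a few lines of Python. This is only a
sketch of the logic, not kernel code: `FileOps`, `generic_checkpoint`, and
`checkpoint_open_files` are hypothetical names, and a `checkpoint` attribute
of `None` stands in for a file operations struct that lacks the .checkpoint
operation.

```python
from typing import Callable, Optional


class FileOps:
    """Toy model of a file operations table with an optional checkpoint hook."""

    def __init__(self, fs_name: str,
                 checkpoint: Optional[Callable[[str], dict]] = None):
        self.fs_name = fs_name
        self.checkpoint = checkpoint  # None models a missing .checkpoint op


def generic_checkpoint(path: str) -> dict:
    """Placeholder for a generic handler; it only records the path."""
    return {"path": path}


def checkpoint_open_files(open_files):
    """Checkpoint (path, ops) pairs, failing on the first file without a hook."""
    saved = []
    for path, ops in open_files:
        if ops.checkpoint is None:
            # Locks and leases on this file are covered too, since holding
            # one requires the file to be open.
            raise NotImplementedError(
                f"{ops.fs_name}: no checkpoint operation for {path}")
        saved.append(ops.checkpoint(path))
    return saved
```

A single open file on a filesystem without the hook is enough to make the
whole checkpoint fail in this model.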

	Each cluster filesystem probably needs some special care when
considering the use of the generic_file_checkpoint operation.

	Using generic_file_checkpoint is appropriate when we have some
way to get a consistent image of the filesystem at the time checkpoint
takes place. How that happens is largely up to the userspace tools
called user-cr. Device-mapper snapshots, fsfreezer + rsync, and
filesystem snapshots will all work. Of course those tools usually don't
save more volatile state information like locks.

	It's quite possible cluster filesystems will need their own
.checkpoint file operations. generic_file_checkpoint is composed of a few
smaller functions which could make writing such ops easier. For example,
we've already reused the smaller functions in .checkpoint operations for
anon_inode-based interfaces, pipes, fifos, and more.

	What it may come down to is this: How do you back up a cluster
filesystem? If there's already a backup method that works then we can
write the .checkpoint operation to rely on it. Often that means we
can use generic_file_checkpoint. The "backup method" should be
something which can be invoked by the userspace checkpoint/restart tools
(user-cr). If the backup method is too slow we can work on
improving it or we can try something else.
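
To make the division of labour concrete, here is a hedged Python sketch of
how a userspace tool might sequence the steps. The function name
`checkpoint_with_backup` and its four callables are hypothetical and are not
the user-cr interface; the ordering (freeze, checkpoint, back up, thaw) is an
assumption about one reasonable way to keep the filesystem image consistent
with the checkpoint.

```python
def checkpoint_with_backup(freeze, checkpoint, backup, thaw):
    """Run the steps in a fixed order and always thaw, even on failure.

    Each argument is a zero-argument callable supplied by the caller, for
    example wrappers around a snapshot tool or the checkpoint command.
    """
    freeze()  # stop the application so files no longer change
    try:
        image = checkpoint()   # save the application state
        snapshot = backup()    # capture the matching filesystem state
    finally:
        thaw()  # resume the application whether or not the steps succeeded
    return image, snapshot
```

Passing the backup step in as a callable keeps the choice of method
(device-mapper snapshot, rsync, filesystem snapshot) outside the sequencing
logic.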

	So perhaps the best thing we can do to help you is learn how
folks back up their cluster filesystems. Got any pointers to basic info
on that?

Cheers,
	-Matt Helsley
