From: Oren Laadan <orenl@xxxxxxxxxxxxxxx>

While we assume all normal files and directories can be
checkpointed, there are, as usual in the VFS, specialized places
that will always need an ability to override these defaults.
Although we could do this completely in the checkpoint code, that
would bitrot quickly.

This adds a new 'file_operations' function for checkpointing a
file.  It is assumed that there should be a dirt-simple way to make
something (un)checkpointable that fits in with current code.

As you can see in the ext[234] patches down the road, all that we
have to do to make something simple be supported is add a single
"generic" f_op entry.

Also adds a new 'file_operations' function for 'collecting' a file
for leak-detection during full-container checkpoint. This is useful
for those files that hold references to other "collectable" objects.
Two examples are pty files that point to corresponding tty objects,
and eventpoll files that refer to the files they are monitoring.

Finally, this patch introduces vfs_fcntl() so that it can be called
from restart (see patch adding restart of files).

Changelog[v21]
  - Update Documentation/filesystems/vfs.txt
  - Put file_ops->checkpoint under CONFIG_CHECKPOINT
Changelog[v17]
  - Introduce 'collect' method
Changelog[v17]
  - Forward-declare 'ckpt_ctx' et-al, don't use checkpoint_types.h

Cc: linux-fsdevel@xxxxxxxxxxxxxxx
Signed-off-by: Oren Laadan <orenl@xxxxxxxxxxxxxxx>
Acked-by: Serge E. Hallyn <serue@xxxxxxxxxx>
Tested-by: Serge E. Hallyn <serue@xxxxxxxxxx>
---
 Documentation/filesystems/vfs.txt |   13 ++++++++++++-
 include/linux/fs.h                |    5 +++++
 2 files changed, 17 insertions(+), 1 deletions(-)

diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
index ed7e5ef..0564eaf 100644
--- a/Documentation/filesystems/vfs.txt
+++ b/Documentation/filesystems/vfs.txt
@@ -716,7 +716,7 @@ struct file_operations
 ----------------------
 
 This describes how the VFS can manipulate an open file. As of kernel
-2.6.22, the following members are defined:
+2.6.34, the following members are defined:
 
 struct file_operations {
 	struct module *owner;
@@ -746,6 +746,10 @@ struct file_operations {
 	int (*flock) (struct file *, int, struct file_lock *);
 	ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, size_t, unsigned int);
 	ssize_t (*splice_read)(struct file *, struct pipe_inode_info *, size_t, unsigned int);
+#ifdef CONFIG_CHECKPOINT
+	int (*checkpoint)(struct ckpt_ctx *, struct file *);
+	int (*collect)(struct ckpt_ctx *, struct file *);
+#endif
 };
 
 Again, all methods are called without any locks being held, unless
@@ -814,6 +818,13 @@ otherwise noted.
   splice_read: called by the VFS to splice data from file to a pipe. This
 	       method is used by the splice(2) system call
 
+  checkpoint: called by checkpoint(2) system call to checkpoint the
+	      state of a file descriptor.
+
+  collect: called by the checkpoint(2) system call to track references to
+	   file descriptors, to detect leaks in full-container checkpoint
+	   (see Documentation/checkpoint/readme.txt).
+
 Note that the file operations are implemented by the specific
 filesystem in which the inode resides. When opening a device node
 (character or block special) most filesystems will call special
diff --git a/include/linux/fs.h b/include/linux/fs.h
index b0f8706..2a90d03 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -407,6 +407,7 @@ struct kstatfs;
 struct vm_area_struct;
 struct vfsmount;
 struct cred;
+struct ckpt_ctx;
 
 extern void __init inode_init(void);
 extern void __init inode_init_early(void);
@@ -1544,6 +1545,10 @@ struct file_operations {
 	ssize_t (*splice_write)(struct pipe_inode_info *, struct file *, loff_t *, size_t, unsigned int);
 	ssize_t (*splice_read)(struct file *, loff_t *, struct pipe_inode_info *, size_t, unsigned int);
 	int (*setlease)(struct file *, long, struct file_lock **);
+#ifdef CONFIG_CHECKPOINT
+	int (*checkpoint)(struct ckpt_ctx *, struct file *);
+	int (*collect)(struct ckpt_ctx *, struct file *);
+#endif
 };
 
 struct inode_operations {
-- 
1.7.2.2
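
For readers who want to see how the two hooks are meant to be used, here
is a rough userspace mock, not kernel code. Only the two hook signatures
are taken from the patch. Everything prefixed mock_, the fields of the
stand-in 'ckpt_ctx' and 'file' structs, and the -EBADF return for a
missing hook are assumptions made for illustration. The real callers
live in other patches of the series and may well behave differently.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in context; the real struct ckpt_ctx is defined elsewhere. */
struct ckpt_ctx {
	int files_saved;	/* hypothetical: files written to the image */
	int refs_collected;	/* hypothetical: references seen by collect */
};

struct file;

/* Only the two hooks added by the patch; all other members omitted. */
struct file_operations {
	int (*checkpoint)(struct ckpt_ctx *, struct file *);
	int (*collect)(struct ckpt_ctx *, struct file *);
};

struct file {
	const struct file_operations *f_op;
	struct file *peer;	/* hypothetical: another object this file pins */
};

/* Hypothetical stand-in for the single "generic" f_op entry. */
static int mock_generic_checkpoint(struct ckpt_ctx *ctx, struct file *file)
{
	(void)file;
	ctx->files_saved++;
	return 0;
}

/* Hypothetical collect hook: report the extra object this file refers to. */
static int mock_collect_peer(struct ckpt_ctx *ctx, struct file *file)
{
	if (file->peer)
		ctx->refs_collected++;
	return 0;
}

/* A file type opts in by filling in the hooks ... */
static const struct file_operations mock_regular_fops = {
	.checkpoint = mock_generic_checkpoint,
};

static const struct file_operations mock_pty_fops = {
	.checkpoint = mock_generic_checkpoint,
	.collect = mock_collect_peer,
};

/* ... and stays uncheckpointable by leaving them NULL. */
static const struct file_operations mock_opaque_fops = {
	.checkpoint = NULL,
};

/* Assumed caller-side policy: a missing hook means "not supported". */
static int mock_checkpoint_file(struct ckpt_ctx *ctx, struct file *file)
{
	assert(ctx && file && file->f_op);
	if (!file->f_op->checkpoint)
		return -EBADF;	/* error code chosen for illustration */
	return file->f_op->checkpoint(ctx, file);
}

/* Count the file itself, then let it report what it refers to. */
static int mock_collect_file(struct ckpt_ctx *ctx, struct file *file)
{
	assert(ctx && file && file->f_op);
	ctx->refs_collected++;
	if (!file->f_op->collect)
		return 0;	/* nothing further to follow */
	return file->f_op->collect(ctx, file);
}
```

In this mock, a file using mock_regular_fops or mock_pty_fops is saved by
mock_checkpoint_file(), while one using mock_opaque_fops is refused
because its checkpoint hook is NULL. mock_collect_file() counts one
reference for a plain file and two for a pty-like file whose 'peer' is
set, which is the kind of bookkeeping a leak detector could compare
against the objects' actual reference counts.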