[PATCH 02/32] vfs: Provide documentation for new mount API [ver #8]

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Provide documentation for the new mount API.

Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
---

 Documentation/filesystems/mounting.txt |  458 ++++++++++++++++++++++++++++++++
 1 file changed, 458 insertions(+)
 create mode 100644 Documentation/filesystems/mounting.txt

diff --git a/Documentation/filesystems/mounting.txt b/Documentation/filesystems/mounting.txt
new file mode 100644
index 000000000000..5230a9711b97
--- /dev/null
+++ b/Documentation/filesystems/mounting.txt
@@ -0,0 +1,458 @@
+			      ===================
+			      FILESYSTEM MOUNTING
+			      ===================
+
+CONTENTS
+
+ (1) Overview.
+
+ (2) The filesystem context.
+
+ (3) The filesystem context operations.
+
+ (4) Filesystem context security.
+
+ (5) VFS filesystem context operations.
+
+
+========
+OVERVIEW
+========
+
+The creation of new mounts is now to be done in a multistep process:
+
+ (1) Create a filesystem context.
+
+ (2) Parse the options and attach them to the context.  Options may be passed
+     individually from userspace.
+
+ (3) Validate and pre-process the context.
+
+ (4) Get or create a superblock and mountable root.
+
+ (5) Perform the mount.
+
+ (6) Return an error message attached to the context.
+
+ (7) Destroy the context.
+
+To support this, the file_system_type struct gains two new fields:
+
+	unsigned short fs_context_size;
+
+which indicates the total amount of space that should be allocated for context
+data (see the Filesystem Context section), and:
+
+	int (*init_fs_context)(struct fs_context *fc, struct super_block *src_sb);
+
+which is invoked to set up the filesystem-specific parts of a filesystem
+context, including the additional space.  The src_sb parameter is used to
+convey the superblock from which the filesystem may draw extra information
+(such as namespaces) for submount (FS_CONTEXT_FOR_SUBMOUNT) or reconfiguration
+(FS_CONTEXT_FOR_RECONFIGURE) purposes - otherwise it will be NULL.
+
+Note that security initialisation is done *after* the filesystem is called so
+that the namespaces may be adjusted first.
+
+And the super_operations struct gains one field:
+
+	int (*reconfigure) (struct super_block *, struct fs_context *);
+
+This shadows the ->reconfigure() operation and takes a prepared filesystem
+context instead of the mount flags and data page.  It may modify the sb_flags
+in the context for the caller to pick up.
+
+[NOTE] reconfigure is intended as a replacement for remount_fs.
+
+
+======================
+THE FILESYSTEM CONTEXT
+======================
+
+The creation and reconfiguration of a superblock is governed by a filesystem
+context.  This is represented by the fs_context structure:
+
+	struct fs_context {
+		const struct fs_context_operations *ops;
+		struct file_system_type *fs;
+		struct dentry		*root;
+		struct user_namespace	*user_ns;
+		struct net		*net_ns;
+		const struct cred	*cred;
+		char			*source;
+		char			*subtype;
+		void			*security;
+		void			*s_fs_info;
+		unsigned int		sb_flags;
+		bool			sloppy;
+		bool			silent;
+		bool			degraded;
+		bool			drop_sb;
+		enum fs_context_purpose	purpose : 8;
+	};
+
+When the VFS creates this, it allocates ->fs_context_size bytes (as specified
+by the file_system_type object) to hold both the fs_context struct and any
+extra data required by the filesystem.  The fs_context struct is placed at the
+beginning of this space.  Any extra space beyond that is for use by the
+filesystem.  The filesystem should wrap the struct in one of its own, e.g.:
+
+	struct nfs_fs_context {
+		struct fs_context fc;
+		...
+	};
+
+placing the fs_context struct first.  container_of() can then be used.  The
+file_system_type would be initialised thus:
+
+	struct file_system_type nfs = {
+		...
+		.fs_context_size	= sizeof(struct nfs_fs_context),
+		.init_fs_context	= nfs_init_fs_context,
+		...
+	};
+
+The fs_context fields are as follows:
+
+ (*) const struct fs_context_operations *ops
+
+     These are operations that can be done on a filesystem context (see
+     below).  This must be set by the ->init_fs_context() file_system_type
+     operation.
+
+ (*) struct file_system_type *fs
+
+     A pointer to the file_system_type of the filesystem that is being
+     constructed or reconfigured.  This retains a reference on the type owner.
+
+ (*) struct dentry *root
+
+     A pointer to the root of the mountable tree (and indirectly, the
+     superblock thereof).  This is filled in by the ->get_tree() op.
+
+ (*) struct user_namespace *user_ns
+ (*) struct net *net_ns
+
+     There are a subset of the namespaces in use by the invoking process.  They
+     retain references on each namespace.  The subscribed namespaces may be
+     replaced by the filesystem to reflect other sources, such as the parent
+     mount superblock on an automount.
+
+ (*) struct cred *cred
+
+     The mounter's credentials.  This retains a reference on the credentials.
+
+ (*) char *source
+
+     This specifies the source.  It may be a block device (e.g. /dev/sda1) or
+     something more exotic, such as the "host:/path" that NFS desires.
+
+ (*) char *subtype
+
+     This is a string to be added to the type displayed in /proc/mounts to
+     qualify it (used by FUSE).  This is available for the filesystem to set if
+     desired.
+
+ (*) void *security
+
+     A place for the LSMs to hang their security data for the superblock.  The
+     relevant security operations are described below.
+
+ (*) void *s_fs_info
+
+     The proposed s_fs_info for a new superblock, set in the superblock by
+     sget_fc().  This can be used to distinguish superblocks.
+
+ (*) unsigned int sb_flags
+
+     This holds the SB_* flags to be set in super_block::s_flags.
+
+ (*) bool sloppy
+ (*) bool silent
+
+     These are set if the sloppy or silent mount options are given.
+
+     [NOTE] sloppy is probably unnecessary when userspace passes over one
+     option at a time since the error can just be ignored if userspace deems it
+     to be unimportant.
+
+     [NOTE] silent is probably redundant with sb_flags & SB_SILENT.
+
+ (*) bool degraded
+
+     This is set if any preallocated resources in the context have been used
+     up, thereby rendering it unreusable for the ->get_tree() op.
+
+ (*) bool drop_sb
+
+     This is set if a superblock reference needs to be deactivated when the
+     context is put.
+
+ (*) enum fs_context_purpose
+
+     This indicates the purpose for which the context is intended.  The
+     available values are:
+
+	FS_CONTEXT_FOR_USER_MOUNT,	-- New superblock for user-specified mount
+	FS_CONTEXT_FOR_KERNEL_MOUNT,	-- New superblock for kernel-internal mount
+	FS_CONTEXT_FOR_SUBMOUNT		-- New automatic submount of extant mount
+	FS_CONTEXT_FOR_RECONFIGURE	-- Change an existing mount
+
+The mount context is created by calling vfs_new_fs_context(), vfs_sb_reconfig()
+or vfs_dup_fs_context() and is destroyed with put_fs_context().  Note that the
+structure is not refcounted.
+
+VFS, security and filesystem mount options are set individually with
+vfs_parse_mount_option().  Options provided by the old mount(2) system call as
+a page of data can be parsed with generic_parse_monolithic().
+
+When mounting, the filesystem is allowed to take data from any of the pointers
+and attach it to the superblock (or whatever), provided it clears the pointer
+in the mount context.
+
+The filesystem is also allowed to allocate resources and pin them with the
+mount context.  For instance, NFS might pin the appropriate protocol version
+module.
+
+
+=================================
+THE FILESYSTEM CONTEXT OPERATIONS
+=================================
+
+The filesystem context points to a table of operations:
+
+	struct fs_context_operations {
+		void (*free)(struct fs_context *fc);
+		int (*dup)(struct fs_context *fc, struct fs_context *src_fc);
+		int (*parse_source)(struct fs_context *fc, char *source);
+		int (*parse_option)(struct fs_context *fc, char *opt, size_t len);
+		int (*parse_monolithic)(struct fs_context *fc, void *data);
+		int (*validate)(struct fs_context *fc);
+		int (*get_tree)(struct fs_context *fc);
+	};
+
+These operations are invoked by the various stages of the mount procedure to
+manage the filesystem context.  They are as follows:
+
+ (*) void (*free)(struct fs_context *fc);
+
+     Called to clean up the filesystem-specific part of the filesystem context
+     when the context is destroyed.  It should be aware that parts of the
+     context may have been removed and NULL'd out by ->get_tree().
+
+ (*) int (*dup)(struct fs_context *fc, struct fs_context *src_fc);
+
+     Called when a filesystem context has been duplicated to get any refs or
+     copy any non-referenced resources held in the filesystem-specific part of
+     the filesystem context.  An error may be returned to indicate failure to
+     do this.
+
+     [!] Note that even if this fails, put_fs_context() will be called
+	 immediately thereafter, so ->dup() *must* make the
+	 filesystem-specific part safe for ->free().
+
+ (*) int (*parse_source)(struct fs_context *fc, char *p);
+
+     Called when a source or device is specified for a filesystem context.
+     This may be called multiple times if the filesystem supports it.  If
+     successful, 0 should be returned or a negative error code otherwise.
+
+ (*) int (*parse_option)(struct fs_context *fc, char *p);
+
+     Called when an option is to be added to the filesystem context.  p points
+     to the option string, likely in "key[=val]" format.  VFS-specific options
+     will have been weeded out and fc->sb_flags updated in the context.
+     Security options will also have been weeded out and fc->security updated.
+
+     If successful, 0 should be returned or a negative error code otherwise.
+
+ (*) int (*parse_monolithic)(struct fs_context *fc, void *data);
+
+     Called when the mount(2) system call is invoked to pass the entire data
+     page in one go.  If this is expected to be just a list of "key[=val]"
+     items separated by commas, then this may be set to NULL.
+
+     The return value is as for ->parse_option().
+
+     If the filesystem (e.g. NFS) needs to examine the data first and then
+     finds it's the standard key-val list then it may pass it off to
+     generic_parse_monolithic().
+
+ (*) int (*validate)(struct fs_context *fc);
+
+     Called when all the options have been applied and the mount is about to
+     take place.  It is should check for inconsistencies from mount options and
+     it is also allowed to do preliminary resource acquisition.  For instance,
+     the core NFS module could load the NFS protocol module here.
+
+     Note that if fc->purpose == FS_CONTEXT_FOR_RECONFIGURE, some of the
+     options necessary for a new mount may not be set.
+
+     The return value is as for ->parse_option().
+
+ (*) int (*get_tree)(struct fs_context *fc);
+
+     Called to get or create the mountable root and superblock, using the
+     information stored in the filesystem context (reconfiguration goes via a
+     different vector).  It may detach any resources it desires from the
+     filesystem context and transfer them to the superblock it creates.
+
+     On success it should set fc->root to the mountable root and return 0.  In
+     the case of an error, it should return a negative error code.
+
+
+===========================
+FILESYSTEM CONTEXT SECURITY
+===========================
+
+The filesystem context contains a security pointer that the LSMs can use for
+building up a security context for the superblock to be mounted.  There are a
+number of operations used by the new mount code for this purpose:
+
+ (*) int security_fs_context_alloc(struct fs_context *fc,
+				   struct super_block *src_sb);
+
+     Called to initialise fc->security (which is preset to NULL) and allocate
+     any resources needed.  It should return 0 on success or a negative error
+     code on failure.
+
+     src_sb will be non-NULL if the context is being created for superblock
+     reconfiguration (FS_CONTEXT_FOR_RECONFIGURE) in which case it indicates
+     the superblock to be reconfigured.  It will also be non-NULL in the case
+     of a submount (FS_CONTEXT_FOR_SUBMOUNT) in which case it indicates the
+     parent superblock.
+
+ (*) int security_fs_context_dup(struct fs_context *fc,
+				 struct fs_context *src_fc);
+
+     Called to initialise fc->security (which is preset to NULL) and allocate
+     any resources needed.  The original filesystem context is pointed to by
+     src_fc and may be used for reference.  It should return 0 on success or a
+     negative error code on failure.
+
+ (*) void security_fs_context_free(struct fs_context *fc);
+
+     Called to clean up anything attached to fc->security.  Note that the
+     contents may have been transferred to a superblock and the pointer cleared
+     during get_tree.
+
+ (*) int security_fs_context_parse_source(struct fs_context *fc, char *src);
+
+     Called for each source (there may be more than one if the filesystem
+     supports it).  The arguments are as for the ->parse_source() method.  It
+     should return 0 on success or a negative error code on failure.
+
+ (*) int security_fs_context_parse_option(struct fs_context *fc, char *opt);
+
+     Called for each mount option.  The arguments are as for the
+     ->parse_option() method.  It should return 0 to indicate that the option
+     should be passed on to the filesystem, 1 to indicate that the option
+     should be discarded or an error to indicate that the option should be
+     rejected.
+
+     The buffer pointed to by opt may be modified.
+
+ (*) int security_fs_context_validate(struct fs_context *fc);
+
+     Called after all the options have been parsed to validate the collection
+     as a whole and to do any necessary allocation so that
+     security_sb_get_tree() is less likely to fail.  It should return 0 or a
+     negative error code.
+
+ (*) int security_sb_get_tree(struct fs_context *fc);
+
+     Called during the mount procedure to verify that the specified superblock
+     is allowed to be mounted and to transfer the security data there.  It
+     should return 0 or a negative error code.
+
+ (*) int security_sb_mountpoint(struct fs_context *fc, struct path *mountpoint);
+
+     Called during the mount procedure to verify that the root dentry attached
+     to the context is permitted to be attached to the specified mountpoint.
+     It should return 0 on success or a negative error code on failure.
+
+
+=================================
+VFS FILESYSTEM CONTEXT OPERATIONS
+=================================
+
+There are four operations for creating a filesystem context and
+one for destroying a context:
+
+ (*) struct fs_context *vfs_new_fs_context(struct file_system_type *fs_type,
+					   struct super_block *src_sb,
+					   unsigned int sb_flags);
+
+     Create a filesystem context for a given filesystem type.  This allocates
+     the filesystem context, sets the flags, initialises the security and calls
+     fs_type->init_fs_context() to initialise the filesystem context.
+
+     src_sb can be NULL or it may indicate a superblock that is going to be
+     reconfigured (FS_CONTEXT_FOR_RECONFIGURE) or a superblock that is the
+     parent of a submount (FS_CONTEXT_FOR_SUBMOUNT).  This superblock is
+     provided as a source of namespace information.
+
+ (*) struct fs_context *vfs_sb_reconfigure(struct vfsmount *mnt,
+					   unsigned int sb_flags);
+
+     Create a filesystem context from the same filesystem as an extant mount
+     and initialise the mount parameters from the superblock underlying that
+     mount.  This is for use by superblock parameter reconfiguration.
+
+ (*) struct fs_context *vfs_dup_fs_context(struct fs_context *src_fc);
+
+     Duplicate a filesystem context, copying any options noted and duplicating
+     or additionally referencing any resources held therein.  This is available
+     for use where a filesystem has to get a mount within a mount, such as NFS4
+     does by internally mounting the root of the target server and then doing a
+     private pathwalk to the target directory.
+
+ (*) void put_fs_context(struct fs_context *fc);
+
+     Destroy a filesystem context, releasing any resources it holds.  This
+     calls the ->free() operation.  This is intended to be called by anyone who
+     created a filesystem context.
+
+     [!] filesystem contexts are not refcounted, so this causes unconditional
+	 destruction.
+
+In all the above operations, apart from the put op, the return is a mount
+context pointer or a negative error code.
+
+For the remaining operations, if an error occurs, a negative error code will be
+returned.
+
+ (*) int vfs_get_tree(struct fs_context *fc);
+
+     Get or create the mountable root and superblock, using the parameters in
+     the filesystem context to select/configure the superblock.  This invokes
+     the ->validate() op and then the ->get_tree() op.
+
+     [NOTE] ->validate() could perhaps be rolled into ->get_tree() and
+     ->reconfigure().
+
+ (*) struct vfsmount *vfs_create_mount(struct fs_context *fc);
+
+     Create a mount given the parameters in the specified filesystem context.
+     Note that this does not attach the mount to anything.
+
+ (*) int vfs_set_fs_source(struct fs_context *fc, char *source);
+
+     Supply one or more source names or device names for the mount.  This may
+     cause the filesystem to access the source.  Multiple sources may be
+     specified if the filesystem supports it.
+
+ (*) int vfs_parse_fs_option(struct fs_context *fc, char *data);
+
+     Supply a single mount option to the filesystem context.  The mount option
+     should likely be in a "key[=val]" string form.  The option is first
+     checked to see if it corresponds to a standard mount flag (in which case
+     it is used to set an SB_xxx flag and consumed) or a security option (in
+     which case the LSM consumes it) before it is passed on to the filesystem.
+
+ (*) int generic_parse_monolithic(struct fs_context *fc, void *data);
+
+     Parse a sys_mount() data page, assuming the form to be a text list
+     consisting of key[=val] options separated by commas.  Each item in the
+     list is passed to vfs_mount_option().  This is the default when the
+     ->parse_monolithic() operation is NULL.




[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux