Re: [PATCH 5/6] ovl: copy on read with consistent_fd feature

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Mar 29, 2017 at 05:36:05PM +0300, Amir Goldstein wrote:
> Overlayfs copies up a file when the file is opened for write.  Writes
> to the returned file descriptor are not visible to file descriptors
> that were opened for read prior to copy up.
> 
> If this config option is enabled then overlay filesystems will default
> to copy up a file that is opened for read to avoid this inconsistency.
> In this case, it is still possible to turn off consistent fd globally
> with the "consistent_fd=off" module option or on a filesystem instance
> basis with the "consistent_fd=off" mount option.

Hi Amir,

So all readers will pay a small cost of copying up file always (as temp
file). I am curious to know if that cost is significant.

Are there any other downsides of opting in for this behavior by default.
I am assuming page cache usage will not be higher due to clone operation.

Vivek


> 
> The feature will not be turned on for an overlay mount unless all
> the layers are on the same underlying filesystem and this filesystem
> supports clone.
> 
> This introduces a slight change in d_real() semantics. Now d_real()
> may return an error with NULL inode argument also in the zero open
> flags case. vfs API documentation has been updated.
> 
> Signed-off-by: Amir Goldstein <amir73il@xxxxxxxxx>
> ---
>  Documentation/filesystems/vfs.txt | 13 ++++++-------
>  fs/overlayfs/Kconfig              | 18 ++++++++++++++++++
>  fs/overlayfs/inode.c              | 27 ++++++++++++++++-----------
>  fs/overlayfs/overlayfs.h          |  4 +++-
>  fs/overlayfs/ovl_entry.h          |  2 ++
>  fs/overlayfs/super.c              | 37 +++++++++++++++++++++++++++++++++----
>  fs/overlayfs/util.c               |  7 +++++++
>  7 files changed, 85 insertions(+), 23 deletions(-)
> 
> diff --git a/Documentation/filesystems/vfs.txt b/Documentation/filesystems/vfs.txt
> index 5692117..0dd317b 100644
> --- a/Documentation/filesystems/vfs.txt
> +++ b/Documentation/filesystems/vfs.txt
> @@ -1088,22 +1088,21 @@ struct dentry_operations {
>  	dentry being transited from.
>  
>    d_real: overlay/union type filesystems implement this method to return one of
> -	the underlying dentries hidden by the overlay.  It is used in three
> +	the underlying dentries hidden by the overlay.  It is used in two
>  	different modes:
>  
>  	Called from open it may need to copy-up the file depending on the
> -	supplied open flags.  This mode is selected with a non-zero flags
> -	argument.  In this mode the d_real method can return an error.
> +	supplied open flags.  It returns the upper real dentry if file was
> +	copied up or the topmost real underlying dentry otherwise.
> +	This mode is selected with a NULL inode argument.
> +	In this mode the d_real method can return an error.
>  
>  	Called from file_dentry() it returns the real dentry matching the inode
>  	argument.  The real dentry may be from a lower layer already copied up,
>  	but still referenced from the file.  This mode is selected with a
>  	non-NULL inode argument.  This will always succeed.
>  
> -	With NULL inode and zero flags the topmost real underlying dentry is
> -	returned.  This will always succeed.
> -
> -	This method is never called with both non-NULL inode and non-zero flags.
> +	This method is never called with non-NULL inode and non-zero open flags.
>  
>  Each dentry has a pointer to its parent dentry, as well as a hash list
>  of child dentries. Child dentries are basically like files in a
> diff --git a/fs/overlayfs/Kconfig b/fs/overlayfs/Kconfig
> index 0daac51..7425862 100644
> --- a/fs/overlayfs/Kconfig
> +++ b/fs/overlayfs/Kconfig
> @@ -22,3 +22,21 @@ config OVERLAY_FS_REDIRECT_DIR
>  	  Note, that redirects are not backward compatible.  That is, mounting
>  	  an overlay which has redirects on a kernel that doesn't support this
>  	  feature will have unexpected results.
> +
> +config OVERLAY_FS_CONSISTENT_FD
> +	bool "Overlayfs: turn on consistent fd feature by default"
> +	depends on OVERLAY_FS
> +	help
> +	  Overlayfs copies up a file when the file is opened for write.  Writes
> +          to the returned file descriptor are not visible to file descriptors
> +          that were opened for read prior to copy up.
> +
> +          If this config option is enabled then overlay filesystems will default
> +          to copy up a file that is opened for read to avoid this inconsistency.
> +          In this case, it is still possible to turn off consistent fd globally
> +          with the "consistent_fd=off" module option or on a filesystem instance
> +          basis with the "consistent_fd=off" mount option.
> +
> +          The feature will not be turned on for an overlay mount unless all
> +          the layers are on the same underlying filesystem and this filesystem
> +          supports clone.
> diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c
> index 17b8418..f2f55e1 100644
> --- a/fs/overlayfs/inode.c
> +++ b/fs/overlayfs/inode.c
> @@ -231,7 +231,7 @@ struct posix_acl *ovl_get_acl(struct inode *inode, int type)
>  }
>  
>  static bool ovl_open_need_copy_up(int flags, enum ovl_path_type type,
> -				  struct dentry *realdentry)
> +				  struct dentry *realdentry, bool rocopyup)
>  {
>  	if (OVL_TYPE_UPPER(type))
>  		return false;
> @@ -239,25 +239,30 @@ static bool ovl_open_need_copy_up(int flags, enum ovl_path_type type,
>  	if (special_file(realdentry->d_inode->i_mode))
>  		return false;
>  
> +	/* Copy up on open for read for consistent fd */
> +	if (rocopyup)
> +		return true;
> +
>  	if (!(OPEN_FMODE(flags) & FMODE_WRITE) && !(flags & O_TRUNC))
>  		return false;
>  
>  	return true;
>  }
>  
> -int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags)
> +int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags,
> +			   bool rocopyup)
>  {
>  	int err = 0;
>  	struct path realpath;
> -	enum ovl_path_type type;
> -
> -	type = ovl_path_real(dentry, &realpath);
> -	if (ovl_open_need_copy_up(file_flags, type, realpath.dentry)) {
> -		err = ovl_want_write(dentry);
> -		if (!err) {
> -			err = ovl_copy_up_flags(dentry, file_flags);
> -			ovl_drop_write(dentry);
> -		}
> +	enum ovl_path_type type = ovl_path_real(dentry, &realpath);
> +
> +	if (!ovl_open_need_copy_up(file_flags, type, realpath.dentry, rocopyup))
> +		return 0;
> +
> +	err = ovl_want_write(dentry);
> +	if (!err) {
> +		err = ovl_copy_up_flags(dentry, file_flags);
> +		ovl_drop_write(dentry);
>  	}
>  
>  	return err;
> diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h
> index 079051e..d13ad5f 100644
> --- a/fs/overlayfs/overlayfs.h
> +++ b/fs/overlayfs/overlayfs.h
> @@ -169,6 +169,7 @@ void ovl_set_dir_cache(struct dentry *dentry, struct ovl_dir_cache *cache);
>  bool ovl_dentry_is_opaque(struct dentry *dentry);
>  bool ovl_dentry_is_whiteout(struct dentry *dentry);
>  void ovl_dentry_set_opaque(struct dentry *dentry);
> +bool ovl_consistent_fd(struct super_block *sb);
>  bool ovl_redirect_dir(struct super_block *sb);
>  void ovl_clear_redirect_dir(struct super_block *sb);
>  const char *ovl_dentry_get_redirect(struct dentry *dentry);
> @@ -207,7 +208,8 @@ int ovl_xattr_get(struct dentry *dentry, const char *name,
>  		  void *value, size_t size);
>  ssize_t ovl_listxattr(struct dentry *dentry, char *list, size_t size);
>  struct posix_acl *ovl_get_acl(struct inode *inode, int type);
> -int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags);
> +int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags,
> +			   bool rocopyup);
>  int ovl_update_time(struct inode *inode, struct timespec *ts, int flags);
>  bool ovl_is_private_xattr(const char *name);
>  
> diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h
> index fb1210d..c11a72d 100644
> --- a/fs/overlayfs/ovl_entry.h
> +++ b/fs/overlayfs/ovl_entry.h
> @@ -14,6 +14,7 @@ struct ovl_config {
>  	char *workdir;
>  	bool default_permissions;
>  	bool redirect_dir;
> +	bool consistent_fd;
>  };
>  
>  /* private information held for overlayfs's superblock */
> @@ -30,6 +31,7 @@ struct ovl_fs {
>  	bool tmpfile;	/* upper supports O_TMPFILE */
>  	bool samefs;	/* all layers on same fs */
>  	bool cloneup;	/* can clone from lower to upper */
> +	bool rocopyup;	/* copy up on open for read */
>  	wait_queue_head_t copyup_wq;
>  };
>  
> diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c
> index 75b93d6..e5fd53a 100644
> --- a/fs/overlayfs/super.c
> +++ b/fs/overlayfs/super.c
> @@ -34,6 +34,11 @@ module_param_named(redirect_dir, ovl_redirect_dir_def, bool, 0644);
>  MODULE_PARM_DESC(ovl_redirect_dir_def,
>  		 "Default to on or off for the redirect_dir feature");
>  
> +static bool ovl_consistent_fd_def = IS_ENABLED(CONFIG_OVERLAY_FS_CONSISTENT_FD);
> +module_param_named(consistent_fd, ovl_consistent_fd_def, bool, 0644);
> +MODULE_PARM_DESC(ovl_consistent_fd_def,
> +		 "Default to on or off for the consistent_fd feature");
> +
>  static void ovl_dentry_release(struct dentry *dentry)
>  {
>  	struct ovl_entry *oe = dentry->d_fsdata;
> @@ -54,6 +59,10 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
>  				 unsigned int open_flags)
>  {
>  	struct dentry *real;
> +	bool rocopyup = !inode && ovl_consistent_fd(dentry->d_sb);
> +
> +	if (WARN_ON(open_flags && inode))
> +		return dentry;
>  
>  	if (!d_is_reg(dentry)) {
>  		if (!inode || inode == d_inode(dentry))
> @@ -64,8 +73,8 @@ static struct dentry *ovl_d_real(struct dentry *dentry,
>  	if (d_is_negative(dentry))
>  		return dentry;
>  
> -	if (open_flags) {
> -		int err = ovl_open_maybe_copy_up(dentry, open_flags);
> +	if (open_flags || rocopyup) {
> +		int err = ovl_open_maybe_copy_up(dentry, open_flags, rocopyup);
>  
>  		if (err)
>  			return ERR_PTR(err);
> @@ -227,6 +236,10 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry)
>  	if (ufs->config.redirect_dir != ovl_redirect_dir_def)
>  		seq_printf(m, ",redirect_dir=%s",
>  			   ufs->config.redirect_dir ? "on" : "off");
> +	if (!(sb->s_flags & MS_RDONLY) &&
> +	    ufs->config.consistent_fd != ovl_consistent_fd_def)
> +		seq_printf(m, ",consistent_fd=%s",
> +			   ufs->config.consistent_fd ? "on" : "off");
>  	return 0;
>  }
>  
> @@ -256,6 +269,8 @@ enum {
>  	OPT_DEFAULT_PERMISSIONS,
>  	OPT_REDIRECT_DIR_ON,
>  	OPT_REDIRECT_DIR_OFF,
> +	OPT_CONSISTENT_FD_ON,
> +	OPT_CONSISTENT_FD_OFF,
>  	OPT_ERR,
>  };
>  
> @@ -266,6 +281,8 @@ static const match_table_t ovl_tokens = {
>  	{OPT_DEFAULT_PERMISSIONS,	"default_permissions"},
>  	{OPT_REDIRECT_DIR_ON,		"redirect_dir=on"},
>  	{OPT_REDIRECT_DIR_OFF,		"redirect_dir=off"},
> +	{OPT_CONSISTENT_FD_ON,		"consistent_fd=on"},
> +	{OPT_CONSISTENT_FD_OFF,		"consistent_fd=off"},
>  	{OPT_ERR,			NULL}
>  };
>  
> @@ -338,6 +355,14 @@ static int ovl_parse_opt(char *opt, struct ovl_config *config)
>  			config->redirect_dir = false;
>  			break;
>  
> +		case OPT_CONSISTENT_FD_ON:
> +			config->consistent_fd = true;
> +			break;
> +
> +		case OPT_CONSISTENT_FD_OFF:
> +			config->consistent_fd = false;
> +			break;
> +
>  		default:
>  			pr_err("overlayfs: unrecognized mount option \"%s\" or missing value\n", p);
>  			return -EINVAL;
> @@ -732,6 +757,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  
>  	init_waitqueue_head(&ufs->copyup_wq);
>  	ufs->config.redirect_dir = ovl_redirect_dir_def;
> +	ufs->config.consistent_fd = ovl_consistent_fd_def;
>  	err = ovl_parse_opt((char *) data, &ufs->config);
>  	if (err)
>  		goto out_free_config;
> @@ -862,11 +888,10 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  			if (!err)
>  				pr_warn("overlayfs: upper fs needs to support d_type.\n");
>  
> -			/* Check if upper/work fs supports O_TMPFILE */
> +			/* Check if upper fs supports O_TMPFILE and clone */
>  			temp = ovl_do_tmpfile(ufs->workdir, S_IFREG | 0);
>  			ufs->tmpfile = !IS_ERR(temp);
>  			if (ufs->tmpfile) {
> -				/* Check if upper/work supports clone */
>  				if (temp->d_inode && temp->d_inode->i_fop &&
>  				    temp->d_inode->i_fop->clone_file_range)
>  					ufs->cloneup = true;
> @@ -910,6 +935,10 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent)
>  	if (!ufs->upper_mnt)
>  		sb->s_flags |= MS_RDONLY;
>  
> +	/* Copy on read for consistent fd depends on clone support */
> +	if (!ufs->cloneup)
> +		ufs->config.consistent_fd = false;
> +
>  	if (remote)
>  		sb->s_d_op = &ovl_reval_dentry_operations;
>  	else
> diff --git a/fs/overlayfs/util.c b/fs/overlayfs/util.c
> index 159e851..be0a993 100644
> --- a/fs/overlayfs/util.c
> +++ b/fs/overlayfs/util.c
> @@ -195,6 +195,13 @@ void ovl_dentry_set_opaque(struct dentry *dentry)
>  	oe->__type |= __OVL_PATH_OPAQUE;
>  }
>  
> +bool ovl_consistent_fd(struct super_block *sb)
> +{
> +	struct ovl_fs *ofs = sb->s_fs_info;
> +
> +	return ofs->config.consistent_fd;
> +}
> +
>  bool ovl_redirect_dir(struct super_block *sb)
>  {
>  	struct ovl_fs *ofs = sb->s_fs_info;
> -- 
> 2.7.4
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
--
To unsubscribe from this list: send the line "unsubscribe linux-unionfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystems Devel]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux