On Tue, Jun 13, 2023 at 5:49 PM Christian Brauner <brauner@xxxxxxxxxx> wrote: > > We ran into issues where mount(8) passed multiple lower layers as one > big string through fsconfig(). But the fsconfig() FSCONFIG_SET_STRING > option is limited to 256 bytes in strndup_user(). While this would be > fixable by extending the fsconfig() buffer I'd rather encourage users to > append layers via multiple fsconfig() calls as the interface allows > nicely for this. This has also been requested as a feature before. > > With this port to the new mount api the following will be possible: > > fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", "/lower1", 0); > > /* set upper layer */ > fsconfig(fs_fd, FSCONFIG_SET_STRING, "upperdir", "/upper", 0); > > /* append "/lower2", "/lower3", and "/lower4" */ > fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", ":/lower2:/lower3:/lower4", 0); > > /* turn index feature on */ > fsconfig(fs_fd, FSCONFIG_SET_STRING, "index", "on", 0); > > /* append "/lower5" */ > fsconfig(fs_fd, FSCONFIG_SET_STRING, "lowerdir", ":/lower5", 0); > > Specifying ':' would have been rejected so this isn't a regression. And > we can't simply use "lowerdir=/lower" to append on top of existing > layers as "lowerdir=/lower,lowerdir=/other-lower" would make > "/other-lower" the only lower layer so we'd break uapi if we changed > this. So the ':' prefix seems a good compromise. > > Users can choose to specify multiple layers at once or individual > layers. A layer is appended if it starts with ":". This requires that > the user has already added at least one layer before. If lowerdir is > specified again without a leading ":" then all previous layers are > dropped and replaced with the new layers. If lowerdir is specified and > empty than all layers are simply dropped. > > An additional change is that overlayfs will now parse and resolve layers > right when they are specified in fsconfig() instead of deferring until > super block creation. This allows users to receive early errors. > > It also allows users to actually use up to 500 layers something which > was theoretically possible but ended up not working due to the mount > option string passed via mount(2) being too large. > > This also allows a more privileged process to set config options for a > lesser privileged process as the creds for fsconfig() and the creds for > fsopen() can differ. We could restrict that they match by enforcing that > the creds of fsopen() and fsconfig() match but I don't see why that > needs to be the case and allows for a good delegation mechanism. > > Plus, in the future it means we're able to extend overlayfs mount > options and allow users to specify layers via file descriptors instead > of paths: > > fsconfig(FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower1", dirfd); > > /* append */ > fsconfig(FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower2", dirfd); > > /* append */ > fsconfig(FSCONFIG_SET_PATH{_EMPTY}, "lowerdir", "lower3", dirfd); > > /* clear all layers specified until now */ > fsconfig(FSCONFIG_SET_STRING, "lowerdir", NULL, 0); > > This would be especially nice if users create an overlayfs mount on top > of idmapped layers or just in general private mounts created via > open_tree(OPEN_TREE_CLONE). Those mounts would then never have to appear > anywhere in the filesystem. But for now just do the minimal thing. > > We should probably aim to move more validation into ovl_fs_parse_param() > so users get errors before fsconfig(FSCONFIG_CMD_CREATE). But that can > be done in additional patches later. > > This is now also rebased on top of the lazy lowerdata lookup which > allows the specificatin of data only layers using the new "::" syntax. > > The rules are simple. A data only layers cannot be followed by any > regular layers and data layers must be preceeded by at least one regular > layer. > > Parsing the lowerdir mount option must change because of this. The > original patchset used the old lowerdir parsing function to split a > lowerdir mount option string such as: > > lowerdir=/lower1:/lower2::/lower3::/lower4 > > simply replacing each non-escaped ":" by "\0". So sequences of > non-escaped ":" were counted as layers. For example, the previous > lowerdir mount option above would've counted 6 layers instead of 4 and a > lowerdir mount option such as: > > lowerdir="/lower1:/lower2::/lower3::/lower4:::::::::::::::::::::::::::" > > would be counted as 33 layers. Other than being ugly this didn't matter > much because kern_path() would reject the first "\0" layer. However, > this overcounting of layers becomes problematic when we base allocations > on it where we very much only want to allocate space for 4 layers > instead of 33. > > So the new parsing function rejects non-escaped sequences of colons > other than ":" and "::" immediately instead of relying on kern_path(). > > Link: https://github.com/util-linux/util-linux/issues/2287 > Link: https://github.com/util-linux/util-linux/issues/1992 > Link: https://bugs.archlinux.org/task/78702 > Link: https://lore.kernel.org/linux-unionfs/20230530-klagen-zudem-32c0908c2108@brauner > Signed-off-by: Christian Brauner <brauner@xxxxxxxxxx> > --- > fs/overlayfs/Makefile | 2 +- > fs/overlayfs/overlayfs.h | 23 +++ > fs/overlayfs/ovl_entry.h | 3 +- > fs/overlayfs/params.c | 388 +++++++++++++++++++++++++++++++++++++++++++++++ > fs/overlayfs/super.c | 376 +++++++++++++++------------------------------ > 5 files changed, 534 insertions(+), 258 deletions(-) > > diff --git a/fs/overlayfs/Makefile b/fs/overlayfs/Makefile > index 9164c585eb2f..4e173d56b11f 100644 > --- a/fs/overlayfs/Makefile > +++ b/fs/overlayfs/Makefile > @@ -6,4 +6,4 @@ > obj-$(CONFIG_OVERLAY_FS) += overlay.o > > overlay-objs := super.o namei.o util.o inode.o file.o dir.o readdir.o \ > - copy_up.o export.o > + copy_up.o export.o params.o > diff --git a/fs/overlayfs/overlayfs.h b/fs/overlayfs/overlayfs.h > index fcac4e2c56ab..7659ea6e02cb 100644 > --- a/fs/overlayfs/overlayfs.h > +++ b/fs/overlayfs/overlayfs.h > @@ -119,6 +119,29 @@ struct ovl_fh { > #define OVL_FH_FID_OFFSET (OVL_FH_WIRE_OFFSET + \ > offsetof(struct ovl_fb, fid)) > > +/* params.c */ > +#define OVL_MAX_STACK 500 > + > +struct ovl_fs_context_layer { > + char *name; > + struct path path; > +}; > + > +struct ovl_fs_context { > + struct path upper; > + struct path work; > + size_t capacity; > + size_t nr; /* includes nr_data */ > + size_t nr_data; > + u8 set; > + struct ovl_fs_context_layer *lower; > +}; > + > +int ovl_parse_param_upperdir(const char *name, struct fs_context *fc, > + bool workdir); > +int ovl_parse_param_lowerdir(const char *name, struct fs_context *fc); > +void ovl_parse_param_drop_lowerdir(struct ovl_fs_context *ctx); > + > extern const char *const ovl_xattr_table[][2]; > static inline const char *ovl_xattr(struct ovl_fs *ofs, enum ovl_xattr ox) > { > diff --git a/fs/overlayfs/ovl_entry.h b/fs/overlayfs/ovl_entry.h > index c72433c06006..7888ab33730b 100644 > --- a/fs/overlayfs/ovl_entry.h > +++ b/fs/overlayfs/ovl_entry.h > @@ -6,7 +6,6 @@ > */ > > struct ovl_config { > - char *lowerdir; > char *upperdir; > char *workdir; > bool default_permissions; > @@ -41,6 +40,7 @@ struct ovl_layer { > int idx; > /* One fsid per unique underlying sb (upper fsid == 0) */ > int fsid; > + char *name; > }; > > /* > @@ -101,7 +101,6 @@ struct ovl_fs { > errseq_t errseq; > }; > > - > /* Number of lower layers, not including data-only layers */ > static inline unsigned int ovl_numlowerlayer(struct ovl_fs *ofs) > { > diff --git a/fs/overlayfs/params.c b/fs/overlayfs/params.c > new file mode 100644 > index 000000000000..a1606af1613f > --- /dev/null > +++ b/fs/overlayfs/params.c > @@ -0,0 +1,388 @@ > +// SPDX-License-Identifier: GPL-2.0-only > + > +#include <linux/fs.h> > +#include <linux/namei.h> > +#include <linux/fs_context.h> > +#include <linux/fs_parser.h> > +#include <linux/posix_acl_xattr.h> > +#include <linux/xattr.h> > +#include "overlayfs.h" > + > +static ssize_t ovl_parse_param_split_lowerdirs(char *str) > +{ > + ssize_t nr_layers = 1, nr_colons = 0; > + char *s, *d; > + > + for (s = d = str;; s++, d++) { > + if (*s == '\\') { > + s++; > + } else if (*s == ':') { > + bool next_colon = (*(s + 1) == ':'); > + > + nr_colons++; > + if (nr_colons == 2 && next_colon) { > + pr_err("only single ':' or double '::' sequences of unescaped colons in lowerdir mount option allowed.\n"); > + return -EINVAL; > + } > + /* count layers, not colons */ > + if (!next_colon) > + nr_layers++; > + > + *d = '\0'; > + continue; > + } > + > + *d = *s; > + if (!*s) { > + /* trailing colons */ > + if (nr_colons) { > + pr_err("unescaped trailing colons in lowerdir mount option.\n"); > + return -EINVAL; > + } > + break; > + } > + nr_colons = 0; > + } > + > + return nr_layers; > +} > + > +static int ovl_mount_dir_noesc(const char *name, struct path *path) > +{ > + int err = -EINVAL; > + > + if (!*name) { > + pr_err("empty lowerdir\n"); > + goto out; > + } > + err = kern_path(name, LOOKUP_FOLLOW, path); > + if (err) { > + pr_err("failed to resolve '%s': %i\n", name, err); > + goto out; > + } > + err = -EINVAL; > + if (ovl_dentry_weird(path->dentry)) { > + pr_err("filesystem on '%s' not supported\n", name); > + goto out_put; > + } > + if (!d_is_dir(path->dentry)) { > + pr_err("'%s' not a directory\n", name); > + goto out_put; > + } > + return 0; > + > +out_put: > + path_put_init(path); > +out: > + return err; > +} > + > +static void ovl_unescape(char *s) > +{ > + char *d = s; > + > + for (;; s++, d++) { > + if (*s == '\\') > + s++; > + *d = *s; > + if (!*s) > + break; > + } > +} > + > +static int ovl_mount_dir(const char *name, struct path *path) > +{ > + int err = -ENOMEM; > + char *tmp = kstrdup(name, GFP_KERNEL); > + > + if (tmp) { > + ovl_unescape(tmp); > + err = ovl_mount_dir_noesc(tmp, path); > + > + if (!err && path->dentry->d_flags & DCACHE_OP_REAL) { > + pr_err("filesystem on '%s' not supported as upperdir\n", > + tmp); > + path_put_init(path); > + err = -EINVAL; > + } > + kfree(tmp); > + } > + return err; > +} > + > +int ovl_parse_param_upperdir(const char *name, struct fs_context *fc, > + bool workdir) > +{ > + int err; > + struct ovl_fs *ofs = fc->s_fs_info; > + struct ovl_config *config = &ofs->config; > + struct ovl_fs_context *ctx = fc->fs_private; > + struct path path; > + char *dup; > + > + err = ovl_mount_dir(name, &path); > + if (err) > + return err; > + > + /* > + * Check whether upper path is read-only here to report failures > + * early. Don't forget to recheck when the superblock is created > + * as the mount attributes could change. > + */ > + if (__mnt_is_readonly(path.mnt)) { > + path_put(&path); > + return -EINVAL; > + } > + > + dup = kstrdup(name, GFP_KERNEL); > + if (!dup) { > + path_put(&path); > + return -ENOMEM; > + } > + > + if (workdir) { > + kfree(config->workdir); > + config->workdir = dup; > + path_put(&ctx->work); > + ctx->work = path; > + } else { > + kfree(config->upperdir); > + config->upperdir = dup; > + path_put(&ctx->upper); > + ctx->upper = path; > + } > + return 0; > +} > + > +void ovl_parse_param_drop_lowerdir(struct ovl_fs_context *ctx) > +{ > + for (size_t nr = 0; nr < ctx->nr; nr++) { > + path_put(&ctx->lower[nr].path); > + kfree(ctx->lower[nr].name); > + ctx->lower[nr].name = NULL; > + } > + ctx->nr = 0; > + ctx->nr_data = 0; > +} > + > +/* > + * Parse lowerdir= mount option: > + * > + * (1) lowerdir=/lower1:/lower2:/lower3::/data1::/data2 > + * Set "/lower1", "/lower2", and "/lower3" as lower layers and > + * "/data1" and "/data2" as data lower layers. Any existing lower > + * layers are replaced. > + * (2) lowerdir=:/lower4 > + * Append "/lower4" to current stack of lower layers. This requires > + * that there already is at least one lower layer configured. > + * (3) lowerdir=::/lower5 > + * Append data "/lower5" as data lower layer. This requires that > + * there's at least one regular lower layer present. > + */ > +int ovl_parse_param_lowerdir(const char *name, struct fs_context *fc) > +{ > + int err; > + struct ovl_fs_context *ctx = fc->fs_private; > + struct ovl_fs_context_layer *l; > + char *dup = NULL, *dup_iter; > + ssize_t nr_lower = 0, nr = 0, nr_data = 0; > + bool append = false, data_layer = false; > + > + /* > + * Ensure we're backwards compatible with mount(2) > + * by allowing relative paths. > + */ > + > + /* drop all existing lower layers */ > + if (!*name) { > + ovl_parse_param_drop_lowerdir(ctx); > + return 0; > + } > + > + if (strncmp(name, "::", 2) == 0) { > + /* > + * This is a data layer. > + * There must be at least one regular lower layer > + * specified. > + */ > + if (ctx->nr == 0) { > + pr_err("data lower layers without regular lower layers not allowed"); > + return -EINVAL; > + } > + > + /* Skip the leading "::". */ > + name += 2; > + data_layer = true; > + /* > + * A data layer is automatically an append as there > + * must've been at least one regular lower layer. > + */ > + append = true; > + } else if (*name == ':') { > + /* > + * This is a regular lower layer. > + * If users want to append a layer enforce that they > + * have already specified a first layer before. It's > + * better to be strict. > + */ > + if (ctx->nr == 0) { > + pr_err("cannot append layer if no previous layer has been specified"); > + return -EINVAL; > + } > + > + /* > + * Once a sequence of data layers has started regular > + * lower layers are forbidden. > + */ > + if (ctx->nr_data > 0) { > + pr_err("regular lower layers cannot follow data lower layers"); > + return -EINVAL; > + } > + > + /* Skip the leading ":". */ > + name++; > + append = true; > + } > + > + dup = kstrdup(name, GFP_KERNEL); > + if (!dup) > + return -ENOMEM; > + > + err = -EINVAL; > + nr_lower = ovl_parse_param_split_lowerdirs(dup); > + if (nr_lower < 0) > + goto out_err; > + > + if ((nr_lower > OVL_MAX_STACK) || > + (append && (size_add(ctx->nr, nr_lower) > OVL_MAX_STACK))) { > + pr_err("too many lower directories, limit is %d\n", OVL_MAX_STACK); > + goto out_err; > + } > + > + if (!append) > + ovl_parse_param_drop_lowerdir(ctx); > + > + /* > + * (1) append > + * > + * We want nr <= nr_lower <= capacity We know nr > 0 and nr <= > + * capacity. If nr == 0 this wouldn't be append. If nr + > + * nr_lower is <= capacity then nr <= nr_lower <= capacity > + * already holds. If nr + nr_lower exceeds capacity, we realloc. > + * > + * (2) replace > + * > + * Ensure we're backwards compatible with mount(2) which allows > + * "lowerdir=/a:/b:/c,lowerdir=/d:/e:/f" causing the last > + * specified lowerdir mount option to win. > + * > + * We want nr <= nr_lower <= capacity We know either (i) nr == 0 > + * or (ii) nr > 0. We also know nr_lower > 0. The capacity > + * could've been changed multiple times already so we only know > + * nr <= capacity. If nr + nr_lower > capacity we realloc, > + * otherwise nr <= nr_lower <= capacity holds already. > + */ > + nr_lower += ctx->nr; > + if (nr_lower > ctx->capacity) { > + err = -ENOMEM; > + l = krealloc_array(ctx->lower, nr_lower, sizeof(*ctx->lower), > + GFP_KERNEL_ACCOUNT); > + if (!l) > + goto out_err; > + > + ctx->lower = l; > + ctx->capacity = nr_lower; > + } > + > + /* > + * (3) By (1) and (2) we know nr <= nr_lower <= capacity. > + * (4) If ctx->nr == 0 => replace > + * We have verified above that the lowerdir mount option > + * isn't an append, i.e., the lowerdir mount option > + * doesn't start with ":" or "::". > + * (4.1) The lowerdir mount options only contains regular lower > + * layers ":". > + * => Nothing to verify. > + * (4.2) The lowerdir mount options contains regular ":" and > + * data "::" layers. > + * => We need to verify that data lower layers "::" aren't > + * followed by regular ":" lower layers > + * (5) If ctx->nr > 0 => append > + * We know that there's at least one regular layer > + * otherwise we would've failed when parsing the previous > + * lowerdir mount option. > + * (5.1) The lowerdir mount option is a regular layer ":" append > + * => We need to verify that no data layers have been > + * specified before. > + * (5.2) The lowerdir mount option is a data layer "::" append > + * We know that there's at least one regular layer or > + * other data layers. => There's nothing to verify. > + */ > + dup_iter = dup; > + for (nr = ctx->nr; nr < nr_lower; nr++) { > + l = &ctx->lower[nr]; > + > + err = ovl_mount_dir_noesc(dup_iter, &l->path); > + if (err) > + goto out_put; > + > + err = -ENOMEM; > + l->name = kstrdup(dup_iter, GFP_KERNEL_ACCOUNT); > + if (!l->name) > + goto out_put; > + > + if (data_layer) > + nr_data++; > + > + /* Calling strchr() again would overrun. */ > + if ((nr + 1) == nr_lower) > + break; > + > + err = -EINVAL; > + dup_iter = strchr(dup_iter, '\0') + 1; > + if (*dup_iter) { > + /* > + * This is a regular layer so we require that > + * there are no data layers. > + */ > + if ((ctx->nr_data + nr_data) > 0) { > + pr_err("regular lower layers cannot follow data lower layers"); > + goto out_put; > + } > + > + data_layer = false; > + continue; > + } > + > + /* This is a data lower layer. */ > + data_layer = true; > + dup_iter++; > + } > + ctx->nr = nr_lower; > + ctx->nr_data += nr_data; > + kfree(dup); > + return 0; > + > +out_put: > + /* > + * We know nr >= ctx->nr < nr_lower. If we failed somewhere > + * we want to undo until nr == ctx->nr. This is correct for > + * both ctx->nr == 0 and ctx->nr > 0. > + */ > + for (; nr >= ctx->nr; nr--) { > + l = &ctx->lower[nr]; > + kfree(l->name); > + l->name = NULL; > + path_put(&l->path); > + > + /* don't overflow */ > + if (nr == 0) > + break; > + } > + > +out_err: > + kfree(dup); > + > + /* Intentionally don't realloc to a smaller size. */ > + return err; > +} > diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c > index 3392dc5d2082..b73b14c52961 100644 > --- a/fs/overlayfs/super.c > +++ b/fs/overlayfs/super.c > @@ -27,8 +27,6 @@ MODULE_LICENSE("GPL"); > > struct ovl_dir_cache; > > -#define OVL_MAX_STACK 500 > - > static bool ovl_redirect_dir_def = IS_ENABLED(CONFIG_OVERLAY_FS_REDIRECT_DIR); > module_param_named(redirect_dir, ovl_redirect_dir_def, bool, 0644); > MODULE_PARM_DESC(redirect_dir, > @@ -109,8 +107,11 @@ static const char *ovl_redirect_mode(struct ovl_config *config) > return ovl_parameter_redirect_dir[config->redirect_mode].name; > } > > +#define fsparam_string_empty(NAME, OPT) \ > + __fsparam(fs_param_is_string, NAME, OPT, fs_param_can_be_empty, NULL) > + > static const struct fs_parameter_spec ovl_parameter_spec[] = { > - fsparam_string("lowerdir", Opt_lowerdir), > + fsparam_string_empty("lowerdir", Opt_lowerdir), > fsparam_string("upperdir", Opt_upperdir), > fsparam_string("workdir", Opt_workdir), > fsparam_flag("default_permissions", Opt_default_permissions), > @@ -125,15 +126,15 @@ static const struct fs_parameter_spec ovl_parameter_spec[] = { > {} > }; > > +/* > + * These options imply different behavior when they are explicitly > + * specified than when they are left in their default state. > + */ > #define OVL_METACOPY_SET BIT(0) > #define OVL_REDIRECT_SET BIT(1) > #define OVL_NFS_EXPORT_SET BIT(2) > #define OVL_INDEX_SET BIT(3) > > -struct ovl_fs_context { > - u8 set; > -}; > - > static struct dentry *ovl_d_real(struct dentry *dentry, > const struct inode *inode) > { > @@ -308,6 +309,7 @@ static void ovl_free_fs(struct ovl_fs *ofs) > for (i = 0; i < ofs->numlayer; i++) { > iput(ofs->layers[i].trap); > mounts[i] = ofs->layers[i].mnt; > + kfree(ofs->layers[i].name); > } > kern_unmount_array(mounts, ofs->numlayer); > kfree(ofs->layers); > @@ -315,7 +317,6 @@ static void ovl_free_fs(struct ovl_fs *ofs) > free_anon_bdev(ofs->fs[i].pseudo_dev); > kfree(ofs->fs); > > - kfree(ofs->config.lowerdir); > kfree(ofs->config.upperdir); > kfree(ofs->config.workdir); > if (ofs->creator_cred) > @@ -433,8 +434,17 @@ static int ovl_show_options(struct seq_file *m, struct dentry *dentry) > struct super_block *sb = dentry->d_sb; > struct ovl_fs *ofs = sb->s_fs_info; > const char *redirect_mode; > - > - seq_show_option(m, "lowerdir", ofs->config.lowerdir); > + size_t nr, nr_merged_lower = ofs->numlayer - ofs->numdatalayer; > + const struct ovl_layer *data_layers = &ofs->layers[nr_merged_lower]; > + > + /* ofs->layers[0] is the upper layer */ > + seq_printf(m, ",lowerdir=%s", ofs->layers[1].name); > + /* dump regular lower layers */ > + for (nr = 2; nr < nr_merged_lower; nr++) > + seq_printf(m, ":%s", ofs->layers[nr].name); > + /* dump data lower layers */ > + for (nr = 0; nr < ofs->numdatalayer; nr++) > + seq_printf(m, "::%s", data_layers[nr].name); > if (ofs->config.upperdir) { > seq_show_option(m, "upperdir", ofs->config.upperdir); > seq_show_option(m, "workdir", ofs->config.workdir); > @@ -509,6 +519,11 @@ static int ovl_fs_params_verify(const struct ovl_fs_context *ctx, > bool nfs_export_opt = ctx->set & OVL_NFS_EXPORT_SET; > bool index_opt = ctx->set & OVL_INDEX_SET; > > + if (ctx->nr_data > 0 && !config->metacopy) { > + pr_err("lower data-only dirs require metacopy support.\n"); > + return -EINVAL; > + } > + > /* Workdir/index are useless in non-upper mount */ > if (!config->upperdir) { > if (config->workdir) { > @@ -723,69 +738,6 @@ static struct dentry *ovl_workdir_create(struct ovl_fs *ofs, > goto out_unlock; > } > > -static void ovl_unescape(char *s) > -{ > - char *d = s; > - > - for (;; s++, d++) { > - if (*s == '\\') > - s++; > - *d = *s; > - if (!*s) > - break; > - } > -} > - > -static int ovl_mount_dir_noesc(const char *name, struct path *path) > -{ > - int err = -EINVAL; > - > - if (!*name) { > - pr_err("empty lowerdir\n"); > - goto out; > - } > - err = kern_path(name, LOOKUP_FOLLOW, path); > - if (err) { > - pr_err("failed to resolve '%s': %i\n", name, err); > - goto out; > - } > - err = -EINVAL; > - if (ovl_dentry_weird(path->dentry)) { > - pr_err("filesystem on '%s' not supported\n", name); > - goto out_put; > - } > - if (!d_is_dir(path->dentry)) { > - pr_err("'%s' not a directory\n", name); > - goto out_put; > - } > - return 0; > - > -out_put: > - path_put_init(path); > -out: > - return err; > -} > - > -static int ovl_mount_dir(const char *name, struct path *path) > -{ > - int err = -ENOMEM; > - char *tmp = kstrdup(name, GFP_KERNEL); > - > - if (tmp) { > - ovl_unescape(tmp); > - err = ovl_mount_dir_noesc(tmp, path); > - > - if (!err && path->dentry->d_flags & DCACHE_OP_REAL) { > - pr_err("filesystem on '%s' not supported as upperdir\n", > - tmp); > - path_put_init(path); > - err = -EINVAL; > - } > - kfree(tmp); > - } > - return err; > -} > - > static int ovl_check_namelen(const struct path *path, struct ovl_fs *ofs, > const char *name) > { > @@ -806,10 +758,6 @@ static int ovl_lower_dir(const char *name, struct path *path, > int fh_type; > int err; > > - err = ovl_mount_dir_noesc(name, path); > - if (err) > - return err; > - > err = ovl_check_namelen(path, ofs, name); > if (err) > return err; > @@ -858,26 +806,6 @@ static bool ovl_workdir_ok(struct dentry *workdir, struct dentry *upperdir) > return ok; > } > > -static unsigned int ovl_split_lowerdirs(char *str) > -{ > - unsigned int ctr = 1; > - char *s, *d; > - > - for (s = d = str;; s++, d++) { > - if (*s == '\\') { > - s++; > - } else if (*s == ':') { > - *d = '\0'; > - ctr++; > - continue; > - } > - *d = *s; > - if (!*s) > - break; > - } > - return ctr; > -} > - > static int ovl_own_xattr_get(const struct xattr_handler *handler, > struct dentry *dentry, struct inode *inode, > const char *name, void *buffer, size_t size) > @@ -978,15 +906,12 @@ static int ovl_report_in_use(struct ovl_fs *ofs, const char *name) > } > > static int ovl_get_upper(struct super_block *sb, struct ovl_fs *ofs, > - struct ovl_layer *upper_layer, struct path *upperpath) > + struct ovl_layer *upper_layer, > + const struct path *upperpath) > { > struct vfsmount *upper_mnt; > int err; > > - err = ovl_mount_dir(ofs->config.upperdir, upperpath); > - if (err) > - goto out; > - > /* Upperdir path should not be r/o */ > if (__mnt_is_readonly(upperpath->mnt)) { > pr_err("upper fs is r/o, try multi-lower layers mount\n"); > @@ -1016,6 +941,11 @@ static int ovl_get_upper(struct super_block *sb, struct ovl_fs *ofs, > upper_layer->idx = 0; > upper_layer->fsid = 0; > > + err = -ENOMEM; > + upper_layer->name = kstrdup(ofs->config.upperdir, GFP_KERNEL); > + if (!upper_layer->name) > + goto out; > + > /* > * Inherit SB_NOSEC flag from upperdir. > * > @@ -1273,46 +1203,37 @@ static int ovl_make_workdir(struct super_block *sb, struct ovl_fs *ofs, > } > > static int ovl_get_workdir(struct super_block *sb, struct ovl_fs *ofs, > - const struct path *upperpath) > + const struct path *upperpath, > + const struct path *workpath) > { > int err; > - struct path workpath = { }; > - > - err = ovl_mount_dir(ofs->config.workdir, &workpath); > - if (err) > - goto out; > > err = -EINVAL; > - if (upperpath->mnt != workpath.mnt) { > + if (upperpath->mnt != workpath->mnt) { > pr_err("workdir and upperdir must reside under the same mount\n"); > - goto out; > + return err; > } > - if (!ovl_workdir_ok(workpath.dentry, upperpath->dentry)) { > + if (!ovl_workdir_ok(workpath->dentry, upperpath->dentry)) { > pr_err("workdir and upperdir must be separate subtrees\n"); > - goto out; > + return err; > } > > - ofs->workbasedir = dget(workpath.dentry); > + ofs->workbasedir = dget(workpath->dentry); > > if (ovl_inuse_trylock(ofs->workbasedir)) { > ofs->workdir_locked = true; > } else { > err = ovl_report_in_use(ofs, "workdir"); > if (err) > - goto out; > + return err; > } > > err = ovl_setup_trap(sb, ofs->workbasedir, &ofs->workbasedir_trap, > "workdir"); > if (err) > - goto out; > - > - err = ovl_make_workdir(sb, ofs, &workpath); > - > -out: > - path_put(&workpath); > + return err; > > - return err; > + return ovl_make_workdir(sb, ofs, workpath); > } > > static int ovl_get_indexdir(struct super_block *sb, struct ovl_fs *ofs, > @@ -1476,13 +1397,13 @@ static int ovl_get_data_fsid(struct ovl_fs *ofs) > > > static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs, > - struct path *stack, unsigned int numlower, > - struct ovl_layer *layers) > + struct ovl_fs_context *ctx, struct ovl_layer *layers) > { > int err; > unsigned int i; > + size_t nr_merged_lower; > > - ofs->fs = kcalloc(numlower + 2, sizeof(struct ovl_sb), GFP_KERNEL); > + ofs->fs = kcalloc(ctx->nr + 2, sizeof(struct ovl_sb), GFP_KERNEL); > if (ofs->fs == NULL) > return -ENOMEM; > > @@ -1509,13 +1430,15 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs, > ofs->fs[0].is_lower = false; > } > > - for (i = 0; i < numlower; i++) { > + nr_merged_lower = ctx->nr - ctx->nr_data; > + for (i = 0; i < ctx->nr; i++) { > + struct ovl_fs_context_layer *l = &ctx->lower[i]; > struct vfsmount *mnt; > struct inode *trap; > int fsid; > > - if (i < numlower - ofs->numdatalayer) > - fsid = ovl_get_fsid(ofs, &stack[i]); > + if (i < nr_merged_lower) > + fsid = ovl_get_fsid(ofs, &l->path); > else > fsid = ovl_get_data_fsid(ofs); > if (fsid < 0) > @@ -1528,11 +1451,11 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs, > * the upperdir/workdir is in fact in-use by our > * upperdir/workdir. > */ > - err = ovl_setup_trap(sb, stack[i].dentry, &trap, "lowerdir"); > + err = ovl_setup_trap(sb, l->path.dentry, &trap, "lowerdir"); > if (err) > return err; > > - if (ovl_is_inuse(stack[i].dentry)) { > + if (ovl_is_inuse(l->path.dentry)) { > err = ovl_report_in_use(ofs, "lowerdir"); > if (err) { > iput(trap); > @@ -1540,7 +1463,7 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs, > } > } > > - mnt = clone_private_mount(&stack[i]); > + mnt = clone_private_mount(&l->path); > err = PTR_ERR(mnt); > if (IS_ERR(mnt)) { > pr_err("failed to clone lowerpath\n"); > @@ -1559,6 +1482,8 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs, > layers[ofs->numlayer].idx = ofs->numlayer; > layers[ofs->numlayer].fsid = fsid; > layers[ofs->numlayer].fs = &ofs->fs[fsid]; > + layers[ofs->numlayer].name = l->name; > + l->name = NULL; > ofs->numlayer++; > ofs->fs[fsid].is_lower = true; > } > @@ -1599,104 +1524,59 @@ static int ovl_get_layers(struct super_block *sb, struct ovl_fs *ofs, > } > > static struct ovl_entry *ovl_get_lowerstack(struct super_block *sb, > - const char *lower, unsigned int numlower, > - struct ovl_fs *ofs, struct ovl_layer *layers) > + struct ovl_fs_context *ctx, > + struct ovl_fs *ofs, > + struct ovl_layer *layers) > { > int err; > - struct path *stack = NULL; > - struct ovl_path *lowerstack; > - unsigned int numlowerdata = 0; > unsigned int i; > + size_t nr_merged_lower; > struct ovl_entry *oe; > + struct ovl_path *lowerstack; > > - if (!ofs->config.upperdir && numlower == 1) { > + struct ovl_fs_context_layer *l; > + > + if (!ofs->config.upperdir && ctx->nr == 1) { > pr_err("at least 2 lowerdir are needed while upperdir nonexistent\n"); > return ERR_PTR(-EINVAL); > } > > - stack = kcalloc(numlower, sizeof(struct path), GFP_KERNEL); > - if (!stack) > - return ERR_PTR(-ENOMEM); > + err = -EINVAL; > + for (i = 0; i < ctx->nr; i++) { > + l = &ctx->lower[i]; > > - for (i = 0; i < numlower;) { > - err = ovl_lower_dir(lower, &stack[i], ofs, &sb->s_stack_depth); > + err = ovl_lower_dir(l->name, &l->path, ofs, &sb->s_stack_depth); > if (err) > - goto out_err; > - > - lower = strchr(lower, '\0') + 1; > - > - i++; > - if (i == numlower) > - break; > - > - err = -EINVAL; > - /* > - * Empty lower layer path could mean :: separator that indicates > - * a data-only lower data. > - * Several data-only layers are allowed, but they all need to be > - * at the bottom of the stack. > - */ > - if (*lower) { > - /* normal lower dir */ > - if (numlowerdata) { > - pr_err("lower data-only dirs must be at the bottom of the stack.\n"); > - goto out_err; > - } > - } else { > - /* data-only lower dir */ > - if (!ofs->config.metacopy) { > - pr_err("lower data-only dirs require metacopy support.\n"); > - goto out_err; > - } > - if (i == numlower - 1) { > - pr_err("lowerdir argument must not end with double colon.\n"); > - goto out_err; > - } > - lower++; > - numlower--; > - numlowerdata++; > - } > - } > - > - if (numlowerdata) { > - ofs->numdatalayer = numlowerdata; > - pr_info("using the lowest %d of %d lowerdirs as data layers\n", > - numlowerdata, numlower); > + return ERR_PTR(err); > } > > err = -EINVAL; > sb->s_stack_depth++; > if (sb->s_stack_depth > FILESYSTEM_MAX_STACK_DEPTH) { > pr_err("maximum fs stacking depth exceeded\n"); > - goto out_err; > + return ERR_PTR(err); > } > > - err = ovl_get_layers(sb, ofs, stack, numlower, layers); > + err = ovl_get_layers(sb, ofs, ctx, layers); > if (err) > - goto out_err; > + return ERR_PTR(err); > > err = -ENOMEM; > /* Data-only layers are not merged in root directory */ > - oe = ovl_alloc_entry(numlower - numlowerdata); > + nr_merged_lower = ctx->nr - ctx->nr_data; > + oe = ovl_alloc_entry(nr_merged_lower); > if (!oe) > - goto out_err; > + return ERR_PTR(err); > > lowerstack = ovl_lowerstack(oe); > - for (i = 0; i < numlower - numlowerdata; i++) { > - lowerstack[i].dentry = dget(stack[i].dentry); > - lowerstack[i].layer = &ofs->layers[i+1]; > + for (i = 0; i < nr_merged_lower; i++) { > + l = &ctx->lower[i]; > + lowerstack[i].dentry = dget(l->path.dentry); > + lowerstack[i].layer = &ofs->layers[i + 1]; > } > - > -out: > - for (i = 0; i < numlower; i++) > - path_put(&stack[i]); > - kfree(stack); > + ofs->numdatalayer = ctx->nr_data; > > return oe; > - > -out_err: > - oe = ERR_PTR(err); > - goto out; > } > > /* > @@ -1804,6 +1684,12 @@ static struct dentry *ovl_get_root(struct super_block *sb, > ovl_set_upperdata(d_inode(root)); > ovl_inode_init(d_inode(root), &oip, ino, fsid); > ovl_dentry_init_flags(root, upperdentry, oe, DCACHE_OP_WEAK_REVALIDATE); > + /* > + * We're going to put upper path when we call > + * fs_context_operations->free() take an additional > + * reference so we can just call path_put(). > + */ > + dget(upperdentry); That's a very odd context for this comment. The fact is that ovl_inode_init() takes ownership of upperdentry so it is better to clarify that ovl_get_root() takes ownership of ovl_get_root() takes ownership of upperdentry. I will post a prep patch.... [...] > @@ -1952,28 +1813,20 @@ static int ovl_fill_super(struct super_block *sb, struct fs_context *fc) > if (err) > goto out_err; > > + /* > + * Check ctx->nr instead of ofs->config.lowerdir since we're > + * going to set ofs->config.lowerdir here after we know that the > + * user specified all layers. > + */ outdated comment - removed in my branch. > err = -EINVAL; > - if (!ofs->config.lowerdir) { > + if (ctx->nr == 0) { > if (fc->sb_flags & SB_SILENT) > pr_err("missing 'lowerdir'\n"); > goto out_err; > } [...] > @@ -2085,13 +1938,10 @@ static int ovl_fill_super(struct super_block *sb, struct fs_context *fc) > sb->s_iflags |= SB_I_SKIP_SYNC; > > err = -ENOMEM; > - root_dentry = ovl_get_root(sb, upperpath.dentry, oe); > + root_dentry = ovl_get_root(sb, ctx->upper.dentry, oe); > if (!root_dentry) > goto out_free_oe; > > - mntput(upperpath.mnt); > - kfree(splitlower); > - > sb->s_root = root_dentry; > ... I will post a prep patch to replace mntput(upperpath.mnt) here with path_put(&upperpath), so you patch just moves path_put() to fs_context_operations->free(). Thanks, Amir.