Re: [GIT PULL] overlayfs update for 4.10

Miklos Szeredi <miklos@xxxxxxxxxx> · Sun, 11 Dec 2016 14:51:15 +0100

On Sun, Dec 11, 2016 at 3:12 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
> On Sat, Dec 10, 2016 at 09:49:26PM +0100, Miklos Szeredi wrote:
>> Hi Al,
>>
>> I usually send overlayfs pulls directly to Linus, but if it suits you, please
>> feel free to pull from:
>>
>>   git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs.git overlayfs-linus
>>
>> This update contains:
>>
>>  - try to clone on copy-up;
>>  - allow renaming a directory;
>>  - fix data inconsistency of read-only fds after copy up;
>>  - misc cleanups and fixes.
>
>         Miklos, I'm very tempted to just let Linus do the... explaining
> why "ovl: add infrastructure for intercepting file ops" is not nicely done.
> It relies upon so damn many subtle things that the result is a minefield for
> any later work.  If nothing else, you've just created a magical place that
> will have to be modified every time somebody adds a method.  Moreover, ->open()
> instances have every right to expect that nothing will change ->f_op after
> they return, period.  That includes things like later comparisons of ->f_op
> with known pointers, etc.
>
> Worse, there's nothing to prohibit embedding file_operations into an object
> with lifetime shorter than that of a module.  Your approach will blow up on
> those.  Sure, at the moment all of them live on weird filesystems that will be
> (hopefully) rejected before you get to that point.  With no promise whatsoever
> that this situation will persist.
>
> overlayfs is already one hell of a special snowflake, but this is just plain
> ridiculous - that sticks its fingers into so many places that making sure they
> don't get squashed will be very hard.  IMO that kind of stuff is on the
> "this should be handled by VFS or not at all" side of things, and I'm not
> at all sure that doing that anywhere is a good idea.

Let me just argue back with what happened with f_path.  We've seen the
breakage, and still nothing guarantees that filesystems won't assume
f_path.dentry is theirs.  This isn't much different IMO, except I
suspect the fallout from this will be much, much smaller than from the
f_path change.  Having said that, I can try fixing it in the VFS, but I
suspect you won't like it much better.

And I tend to agree with you about the usefulness of this whole
change.  However, (intelligent) people will argue against building on
overlayfs because it's "not a POSIX fs", having quirks like this.  So
it's really the perception that needs to be fixed, and AFAICS the only
way to fix that is to fix the quirks.

> PS: macros like
> +#define OVL_CALL_REAL_FOP(file, call) \
> +       ({ struct ovl_fops *__ofop =                                     \
> +                       container_of(file->f_op, struct ovl_fops, fops); \
> +          WARN_ON(__ofop->magic != OVL_FOPS_MAGIC) ? -EIO :             \
> +                  __ofop->orig_fops->call;                              \
> +       })
>
> with uses along the lines of
> +               return OVL_CALL_REAL_FOP(file,
> +                                        fsync(file, start, end, datasync));
> make some things (like, you know, "find all places where a method could
> be called") harder for no good reason.

Makes sense.  I can expand them inline.
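
For illustration only, here is a rough userspace sketch of what the
inline expansion could look like for fsync.  It is not the actual
patch: struct file, struct file_operations and container_of are
simplified stand-ins for the kernel definitions, the layout of struct
ovl_fops is guessed from the quoted macro, the OVL_FOPS_MAGIC value is
made up, and demo_real_fsync()/ovl_demo_fsync() are hypothetical
helpers that exist only to exercise the code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Simplified userspace stand-ins for the kernel types (assumptions). */
struct file;
struct file_operations {
	int (*fsync)(struct file *, long long, long long, int);
};
struct file {
	const struct file_operations *f_op;
};

#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

#define OVL_FOPS_MAGIC 0x6f766c66UL	/* made-up value for the sketch */

/* Layout guessed from the quoted macro; not taken from the real patch. */
struct ovl_fops {
	unsigned long magic;
	const struct file_operations *orig_fops;
	struct file_operations fops;
};

/*
 * The macro body written out by hand.  The ->fsync() call is now an
 * ordinary expression that a search for method calls will find.
 */
static int ovl_fsync(struct file *file, long long start, long long end,
		     int datasync)
{
	struct ovl_fops *ofop =
		container_of(file->f_op, struct ovl_fops, fops);

	/* The kernel version would WARN_ON() here; the sketch only checks. */
	if (ofop->magic != OVL_FOPS_MAGIC)
		return -EIO;

	return ofop->orig_fops->fsync(file, start, end, datasync);
}

/* Hypothetical "real" filesystem method, used only by the demo below. */
static int demo_real_fsync(struct file *file, long long start,
			   long long end, int datasync)
{
	(void)file; (void)start; (void)end; (void)datasync;
	return 7;
}

/* Build a wrapped file with the given magic and call fsync through it. */
static int ovl_demo_fsync(unsigned long magic)
{
	static const struct file_operations real = {
		.fsync = demo_real_fsync,
	};
	struct ovl_fops ofop = {
		.magic = magic,
		.orig_fops = &real,
		.fops = { .fsync = ovl_fsync },
	};
	struct file file = { .f_op = &ofop.fops };

	return file.f_op->fsync(&file, 0, 0, 1);
}
```

With a matching magic the call reaches the wrapped method; with a
mismatched one the wrapper returns -EIO without touching orig_fops.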

>
> While we are at it,
> +               module_put(ofop->owner);
> +               fops_put(ofop->orig_fops);
> is wrong - if that was the last reference to a module, your fops_put()
> might very well try and access a vfree'd area...

Yeah, the order is wrong.  Will fix.

Thanks,
Miklos