Re: [PATCH v2 2/4] overlay: Document current outstanding shortcoming of volatile

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Nov 27, 2020 at 02:52:52PM +0200, Amir Goldstein wrote:
> On Fri, Nov 27, 2020 at 11:21 AM Sargun Dhillon <sargun@xxxxxxxxx> wrote:
> >
> > This documents behaviour that was discussed in a thread about the volatile
> > feature. Specifically, how failures can go hidden from asynchronous writes
> > (such as from mmap, or writes that are not immediately flushed to the
> > filesystem). Although we pass through calls like msync, fallocate, and
> > write, and will still return errors on those, it doesn't guarantee all
> > kinds of errors will happen at those times, and thus may hide errors.
> >
> > In the future, we can add error checking to all interactions with the
> > upperdir, and pass through errseq_t from the upperdir on mappings,
> > and other interactions with the filesystem[1].
> >
> > [1]: https://lore.kernel.org/linux-unionfs/20201116045758.21774-1-sargun@xxxxxxxxx/T/#m7d501f375e031056efad626e471a1392dd3aad33
> >
> > Signed-off-by: Sargun Dhillon <sargun@xxxxxxxxx>
> > Cc: linux-fsdevel@xxxxxxxxxxxxxxx
> > Cc: linux-unionfs@xxxxxxxxxxxxxxx
> > Cc: Miklos Szeredi <miklos@xxxxxxxxxx>
> > Cc: Amir Goldstein <amir73il@xxxxxxxxx>
> > Cc: Vivek Goyal <vgoyal@xxxxxxxxxx>
> > ---
> >  Documentation/filesystems/overlayfs.rst | 6 +++++-
> >  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/Documentation/filesystems/overlayfs.rst b/Documentation/filesystems/overlayfs.rst
> > index 580ab9a0fe31..c6e30c1bc2f2 100644
> > --- a/Documentation/filesystems/overlayfs.rst
> > +++ b/Documentation/filesystems/overlayfs.rst
> > @@ -570,7 +570,11 @@ Volatile mount
> >  This is enabled with the "volatile" mount option.  Volatile mounts are not
> >  guaranteed to survive a crash.  It is strongly recommended that volatile
> >  mounts are only used if data written to the overlay can be recreated
> > -without significant effort.
> > +without significant effort.  In addition to this, the sync family of syscalls
> > +are not sufficient to determine whether a write failed as sync calls are
> > +omitted.  For this reason, it is important that the filesystem used by the
> > +upperdir handles failure in a fashion that's suitable for the user.  For
> > +example, upon detecting a fault, ext4 can be configured to panic.
> >
> 
> Reading this now, I think I may have wrongly analysed the issue.
> Specifically, when I wrote that the very minimum is to document the
> issue, it was under the assumption that a proper fix is hard.
> I think I was wrong and that the very minimum is to check for errseq
> since mount on the fsync and syncfs calls.
> 
> Why? first of all because it is very very simple and goes a long way to
> fix the broken contract with applications, not the contract about durability
> obviously, but the contract about write+fsync+read expects to find the written
> data (during the same mount era).
> 
> Second, because the sentence you added above is hard for users to understand
> and out of context. If we properly handle the writeback error in fsync/syncfs,
> then this sentence becomes irrelevant.
> The fact that ext4 can lose data if application did not fsync after
> write is true
> for volatile as well as non-volatile and it is therefore not relevant
> in the context
> of overlayfs documentation at all.
> 
> Am I wrong saying that it is very very simple to fix?
> Would you mind making that fix at the bottom of the patch series, so it can
> be easily applied to stable kernels?
> 
> Thanks,
> Amir.

I'm not sure it's so easy. In VFS, there are places where the superblock's 
errseq is checked[1]. AFAIK, that's going to check "our" errseq, and not the 
errseq of the real corresponding real file's superblock. One solution might be 
as part of all these callbacks we set our errseq to the errseq of the filesystem 
that the upperdir, and then rely on VFS's checking.

I'm having a hard time figuring out how to deal with the per-mapping based
error tracking. It seems like this infrastructure was only partially completed
by Jeff Layton[2]. I don't now how it's actually supposed to work right now,
as not all of his patches landed.

How about I split this into two patchsets? One, where I add the logic to copy
the errseq on callbacks to fsync from the upperdir to the ovl fs superblock,
and thus allowing VFS to bubble up errors, and the documentation. We can CC
stable on those because I think it has an effect that's universal across
all filesystems.

P.S. 
I notice you maintain overlay tests outside of the kernel. Unfortunately, I 
think for this kind of test, it requires in kernel code to artificially bump the 
writeback error count on the upperdir, or it requires the failure injection 
infrastructure. 

Simulating this behaviour is non-trivial without in-kernel support:

P1: Open(f) -> p1.fd
P2: Open(f) -> p2.fd
P1: syncfs(p1.fd) -> errrno
P2: syncfs(p1.fd) -> 0 # This should return an error


[1]: https://elixir.bootlin.com/linux/latest/source/fs/sync.c#L175
[2]: https://lwn.net/Articles/722250/



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux