Re: EIO for removed redirected files?

On Thu, Aug 13, 2020 at 8:22 PM Kevin Locke <kevin@xxxxxxxxxxxxxxx> wrote:
>
> Thanks again Amir!  I'll work on patches for the docs and adding
> pr_warn_ratelimited() for invalid metacopy/redirect as soon as I get a
> chance.
>
> On Wed, 2020-08-12 at 20:06 +0300, Amir Goldstein wrote:
> > On Wed, Aug 12, 2020 at 7:05 PM Kevin Locke <kevin@xxxxxxxxxxxxxxx> wrote:
> >> On Wed, 2020-08-12 at 18:21 +0300, Amir Goldstein wrote:
> >>> I guess the only thing we could document is that changes to underlying
> >>> layers with metacopy and redirects have undefined results.
> >>> Vivek was a proponent of making the statements about outcome of
> >>> changes to underlying layers sound more harsh.
> >>
> >> That sounds good to me.  My current use case involves offline changes to
> >> the lower layer on a routine basis, and I interpreted the current
> >
> > You are not the only one; I hear of many users that do that, but
> > nobody has ever bothered to sit down and document the requirements -
> > what exactly is the use case and what is the expected outcome.
>
> I can elaborate a bit.  Keep in mind that it's a personal use case which
> is flexible, so it's probably not worth supporting specifically, but may
> be useful to discuss/consider:
>
> A few machines that I manage are dual-boot between Windows and Linux,
> with software that runs on both OSes (Steam).  This software installs a
> lot (>100GB) of semi-static data which is mostly (>90%) the same between
> OSes, but not partitioned by folder or designed to be shared between
> them.  The software includes mechanisms for validating the data files
> and automatically updating/repairing any files which do not match
> expectations.
>
> I currently mount an overlayfs of the Windows data directory on the
> Linux data directory to avoid storing multiple copies of common data.
> After any data changes in Windows, I re-run the data file validation in
> Linux to ensure the data is consistent.  I also occasionally run a
> deduplication script[1] to remove files which may have been updated on
> Linux and later updated to the same contents on Windows.
>

Nice use case.
It may be a niche use case the way you describe it, but the general
concept of "updatable software" at the lower layer is not unique to
your setup. See this recent example [1] that spawned the thread about
updating the documentation w.r.t. changing underlying layers.

[1] https://lore.kernel.org/linux-unionfs/32532923.JtPX5UtSzP@fgdesktop/
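
For anyone who finds this thread later, a setup like the one described
above is typically created along these lines (all paths here are
illustrative; workdir must be an empty directory on the same filesystem
as upperdir):

    # lower layer: the Windows data; upper layer: Linux-side changes
    mount -t overlay overlay \
        -o lowerdir=/mnt/windows/SteamLibrary,upperdir=/data/steam/upper,workdir=/data/steam/work \
        /home/user/.steam/steamapps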

> To support this use, I'm looking for a way to configure overlayfs such
> that offline changes to the lower dir do not break things in a way that
> can't be recovered by naive file content validation.  Beyond that, any
> performance-enhancing and space-saving features are great.
>
> metacopy and redirection would be nice to have, but are not essential as
> the program does not frequently move data files or modify their
> metadata.

That's what I figured.

> If accessing an invalid metacopy behaved like a 0-length
> file, it would be ideal for my use case (since it would be deleted and
> re-created by file validation) but I can understand why this would be
> undesirable for other cases and problematic to implement.  (I'm

I wouldn't say it is "problematic" to implement. It is simple to
convert the EIO to a warning (with an opt-in option). What would be a
challenge to implement is the behavior where metadata access is allowed
for broken metacopy, but data access results in EIO.

> experimenting with seccomp to prevent/ignore metadata changes, since the
> program should run on filesystems which do not support them.  An option
> to ignore/reject metadata changes would be handy, but may not be
> justified.)
>
> Does that explain?  Does it seem reasonable?  Is disabling metacopy and
> redirect_dir likely to be sufficient?

Yes, disabling metacopy and redirect_dir sounds like the right thing
to do, because I don't think they gain you much anyway.
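
If you want to pin that down explicitly rather than rely on the kernel
config defaults, something like this should do (paths illustrative):

    mount -t overlay overlay \
        -o lowerdir=/lower,upperdir=/upper,workdir=/work,redirect_dir=off,metacopy=off \
        /merged

With redirect_dir=off, renaming a directory that would need a redirect
fails with EXDEV, and with metacopy=off a chmod/chown simply triggers a
full copy-up, both of which sound acceptable for your workload.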

>
> Best,
> Kevin
>
> [1]: Do you know of any overlayfs-aware deduplication programs?  If not,
> I may consider cleaning up and publishing mine at some point.

I know about overlayfs-tools's "merge" command.
I do not know if anyone is using this tool besides perhaps its author (?).
Incidentally, I recently implemented the "deref" command for
overlayfs-tools [2], which unfolds metacopy and redirect_dir and creates
an upper layer without them. The resulting layer can then be deduped
with the lower layer using the "merge" command.

[2] https://github.com/kmxz/overlayfs-tools/pull/11
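
For reference, the naive core of such a dedup pass is tiny. Here is a
rough sketch with illustrative paths; it deliberately ignores the parts
an overlayfs-aware tool must get right (opaque directories marked with
the trusted.overlay.opaque xattr, metacopy/redirect xattrs, and odd
filenames):

    # run only while the overlay is NOT mounted
    cd /upper || exit 1
    find . -type f | while read -r f; do
        # whiteouts are character devices, so -type f already skips them
        if cmp -s -- "$f" "/lower/$f"; then
            rm -- "$f"    # upper copy is byte-identical to the lower one
        fi
    done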

I also made the existing overlayfs-tools commands aware of metacopy
and redirect_dir (in the same pull request). The "merge" command
simply aborts when they are encountered, but the "vacuum" and "diff"
commands handle them correctly. I also added the "overlay diff -b"
variant, which creates output equivalent to that of the standard diff
tool (diffutils) just by analyzing the layers.
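
Assuming the tool's usual -l/-u options for naming the layers (please
check the README, this is from memory), usage would be along the lines
of:

    # emit a diffutils-style description of the upper-vs-lower changes
    ./overlay diff -b -l /lower -u /upper > layers.patch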

Thanks,
Amir.


