On Wed, Aug 12, 2020 at 7:05 PM Kevin Locke <kevin@xxxxxxxxxxxxxxx> wrote: > > Thanks for the quick response Amir! > > On Wed, 2020-08-12 at 18:21 +0300, Amir Goldstein wrote: > > On Wed, Aug 12, 2020 at 5:06 PM Kevin Locke <kevin@xxxxxxxxxxxxxxx> wrote: > >> I recently encountered files on an overlayfs which returned EIO > >> (Input/output error) for open, stat, and unlink (and presumably other) > >> syscalls. I eventually determined that the files had been redirected > > > > It's *empty* redirected files that cause the alleged problem. > > When I replace `touch foo.txt` with `echo 123 > foo.txt` I observe the > same behavior. If I understand you correctly, you are saying that EIO > is correct for non-empty files, but potentially incorrect for empty > files (which could be copied rather than redirected, since there is no > space saving)? > I wouldn't call it "incorrect" more like "unnecessary". > >> At this point, the only way to recover appears to be unmounting the > >> overlay and removing the file from upper (or updating the > >> overlay.redirect xattr to a valid location). Is that correct? > >> > >> Is this the intended behavior? > > > > Yes. > > What would you expect to happen when data of metacopy file has been removed? > > After reflection, EIO probably makes the most sense for open/stat. It > might be nice to be able to unlink the file to allow recovery (in the > sense of being able to reuse the name) without unmounting the overlay, It would be nice, but somebody needs to care enough to implement it and it is not going to be trivial, because error on lookup is much easier then selective error on a "broken" dentry depending on the operation... > but the documentation updates may be sufficient to keep users from > getting into this state. > > >> unionmount-testsuite. If so, perhaps the behavior could be noted in > >> "Changes to underlying filesystems" in > >> Documentation/filesystems/overlayfs.rst? I'd be willing to write a > >> first draft. (I doubt I understand it well enough to get everything > >> right on the first try.) > > > > I guess the only thing we could document is that changes to underlying > > layers with metacopy and redirects have undefined results. > > Vivek was a proponent of making the statements about outcome of > > changes to underlying layers sound more harsh. > > That sounds good to me. My current use case involves offline changes to > the lower layer on a routine basis, and I interpreted the current You are not the only one, I hear of many users that do that, but nobody ever bothered to sit down and document the requirements - what exactly is the use case and what is the expected outcome. > wording "Offline changes, when the overlay is not mounted, are allowed > to either the upper or the lower trees." to mean that such offline > modifications would not break things in unexpected ways. > The truth is that this documentation is old, before all the new features were added. See here [1], Vivek suggested: "Modifying/recreating lower layer only works when metacopy/index/nfs_export are not enabled at any point of time. This also will change inode number reporting behavior." > In retrospect, I should have expected this behavior, but as someone > previously unfamiliar with overlayfs, I hadn't considered that metacopy > results in file redirects and that if the underlying file were removed > without removing any redirects pointing to it that it would manifest in > this way and be so difficult to clean up. > > If metacopy and dir_redirect are disabled, are offline modifications to > the lower layer permitted, or could any such modification result in > undefined behavior? > With metacopy/index/nfs_export/redirect_dir disabled code behaves mostly like it did at the time that this documentation was written, so I guess you may say that changes are permitted and result in "defined" behavior. > >> Also, if there is any way this could be made easier to debug, it might > >> be helpful for users, particularly newbies like myself. Perhaps logging > >> bad redirects at KERN_ERR? If that would be too noisy, perhaps only > >> behind a debug module option? Again, if this is acceptable I'd be > >> willing to draft a patch. (Same caveat as above.) > > > > There are a handful of places in overlayfs where returning EIO is > > combined with informative pr_warn_ratelimited(). > > Ah, indeed. Would doing so for missing/invalid metacopy/redirect make > sense? > It seems fine to me. Thanks, Amir. [1] https://lore.kernel.org/linux-unionfs/20200708173601.GD103536@xxxxxxxxxx/T/#m75db6db2fc8d9739ce6fed9dfebe81bb1dd53bf9