On Fri, Dec 09, 2022 at 11:43:44PM -0800, Noah Misch wrote: > On Mon, Nov 28, 2022 at 06:50:59PM -0800, Darrick J. Wong wrote: > > On Sat, Nov 19, 2022 at 05:34:12PM -0800, Noah Misch wrote: > > > On Tue, Nov 15, 2022 at 07:14:47PM -0800, Darrick J. Wong wrote: > > > > On Wed, Nov 09, 2022 at 08:54:52PM -0800, Noah Misch wrote: > > > > > Subject line has my typo: s/sync_file_range/copy_file_range/ > > > > > Another dumb thing about how the pagecache tracks errors is that it sets > > > > a single state bit for the whole mapping, which means that we can't > > > > actually /tell/ userspace which part of their file is now busted. We > > > > can't even tell if userspace has successfully rewrite()d all the regions > > > > where writeback failed, which leads me to... > > > > > > > > Another another dumb thing about how the pagecache tracks errors is that > > > > any fsync-lik operation will test_and_clear_bit the EIO state, which > > > > means that if we find a past EIO, we'll clear that state and return the > > > > EIO to userspace. > > > > > > > > We /could/ change FICLONE to flush the dirty pagecache, sample the EIO > > > > status *without* clearing it, and return EIO if it's set. That's > > > > probably the most unabsurd way to deal with this, but it's unsettling > > > > that even cp ignores errno returns now. The manpage for FICLONE doesn't > > > > explicitly mention any fsync behaviors, so perhaps "flush and retain > > > > EIO" is the right choice here. > > > > > > That reminds me of > > > https://postgr.es/m//20180427222842.in2e4mibx45zdth5@xxxxxxxxxxxxxxxxx. Its > > > summary of a LSF/MM 2018 discussion mentioned NFS writeback errors detected > > > and cleared at close(), which I find similar. I might favor a uniform policy, > > > one of: > > > > > > a. Any syscall with a file descriptor argument might return EIO. If it does, > > > it clears the EIO. > > > b. Any syscall with a file descriptor argument might return EIO. Only a > > > specific list of syscalls, having writeback-oriented names, clear EIO: > > > fsync(), syncfs(), [...]. Others report EIO without clearing it. > > > > > > One argument for (b) is that, on EIO from FICLONE or copy_file_range(), the > > > caller can't know whether the broken file is the source or the destination. A > > > cautious caller should assume both are broken. What other considerations > > > should influence the decision? > > > > That's a very good point you've raised -- userspace can't associate an > > EIO return value with either of the fds in use. It can't even tell if > > the filesystem itself hit some metadata error somewhere else (e.g. > > refcount data), and that's the real reason why EIO got thrown back to > > userspace. > > > > On those grounds, I think FICLONE/FIEDEDUPE need to preserve the > > AS_EIO/AS_ENOSPC state in the address_space so that actual fsync (or > > syncfs, or any of the known 'persist me now' calls) can also return the > > status. > > > > I'll try to push that for 6.3. > > That sounds good. Thank you. Please CC me on any threads you create for > this, if not inconvenient. Will do. --D