On Tue, 2018-09-25 at 00:30 +0100, Alan Cox wrote:
> > write()
> > kernel attempts to write back page and fails
> > page is marked clean and evicted from the cache
> > read()
> > 
> > Now your write is gone and there were no calls between the write and
> > read.
> > 
> > The question we still need to answer is this:
> > 
> > When we attempt to write back some data from the cache and that fails,
> > what should happen to the dirty pages?
> 
> Why do you care about the content of the pages at that point? The only
> options are to use the data (today's model), or to report that you are
> on fire.

The data itself doesn't matter much. What does matter is consistent
behavior in the face of such an error. The issue (IMO) is that currently
the result of a read that takes place after a write but before an fsync
is indeterminate: if writeback succeeded (or hasn't happened yet) you'll
get back the data you wrote, but if there was a writeback error you may
or may not. The behavior in that case mostly depends on the whim of the
filesystem developer, and they all behave somewhat differently.

> If you are going to error, you don't need to use the data, so you could
> in fact compress dramatically the amount of stuff you need to save
> somewhere. You need the page information so you can realize what page
> this is, but you can point the data into oblivion somewhere, because
> you are no longer going to give it to anyone (assuming you can
> successfully force-unmap it from everyone once it's not locked by a
> DMA or similar).
> 
> In the real world, though, it's fairly unusual to just lose a bit of
> I/O. Flash devices in particular have a nasty tendency to simply go
> *poof*, and the first you know about an I/O error is the last data the
> drive ever gives you short of JTAG. NFS is an exception, and NFS soft
> timeouts are nasty.

Linux has dozens of filesystems, and they all behave differently in
this regard.
A catastrophic failure (paradoxically) makes things simpler for the fs
developer, but even on local filesystems isolated errors can occur. It's
also not just NFS -- what mostly started me down this road was working
on ENOSPC handling for CephFS.

I think it'd be good to at least establish a "gold standard" for what
filesystems ought to do in this situation. We might not be able to
achieve that in all cases, but we could then document the exceptions.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>