On Fri, Jun 05, 2020 at 08:57:26AM +1000, Dave Chinner wrote: > On Thu, Jun 04, 2020 at 01:23:40PM -0700, Matthew Wilcox wrote: > > From: "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx> > > > > Test generic/019 often results in: > > > > WARNING: at fs/iomap/buffered-io.c:1069 iomap_page_mkwrite_actor+0x57/0x70 > > > > Since this can happen due to a storage error, we should not WARN for it. > > Just return -EIO, which will be converted to a SIGBUS for the hapless > > task attempting to write to the page that we can't read. > > Why didn't the "read" part of the fault which had the EIO error fail > the page fault? i.e. why are we waiting until deep inside the write > fault path to error out on a failed page read? I have a hypothesis that I don't know how to verify. First the task does a load from the page and we put a read-only PTE in the page tables. Then it writes to the page using write(). The page gets written back, but hits an error in iomap_writepage_map() which calls ClearPageUptodate(). Then the task with it mapped attempts to store to it. I haven't dug through what generic/019 does, so I don't know how plausible this is.