Re: [PATCH] iomap: Handle I/O errors gracefully in page_mkwrite

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jun 05, 2020 at 09:30:53AM +1000, Dave Chinner wrote:
> On Thu, Jun 04, 2020 at 04:05:19PM -0700, Matthew Wilcox wrote:
> > On Fri, Jun 05, 2020 at 08:57:26AM +1000, Dave Chinner wrote:
> > > On Thu, Jun 04, 2020 at 01:23:40PM -0700, Matthew Wilcox wrote:
> > > > From: "Matthew Wilcox (Oracle)" <willy@xxxxxxxxxxxxx>
> > > > 
> > > > Test generic/019 often results in:
> > > > 
> > > > WARNING: at fs/iomap/buffered-io.c:1069 iomap_page_mkwrite_actor+0x57/0x70
> > > > 
> > > > Since this can happen due to a storage error, we should not WARN for it.
> > > > Just return -EIO, which will be converted to a SIGBUS for the hapless
> > > > task attempting to write to the page that we can't read.
> > > 
> > > Why didn't the "read" part of the fault which had the EIO error fail
> > > the page fault? i.e. why are we waiting until deep inside the write
> > > fault path to error out on a failed page read?
> > 
> > I have a hypothesis that I don't know how to verify.
> > 
> > First the task does a load from the page and we put a read-only PTE in
> > the page tables.  Then it writes to the page using write().  The page
> > gets written back, but hits an error in iomap_writepage_map()
> > which calls ClearPageUptodate().  Then the task with it mapped attempts
> > to store to it.
> 
> Sure, but that's not really what I was asking: why isn't this
> !uptodate state caught before the page fault code calls
> ->page_mkwrite? The page fault code has a reference to the page,
> after all, and in a couple of paths it even has the page locked.

If there's already a PTE present, then the page fault code doesn't
check the uptodate bit.  Here's the path I'm looking at:

do_wp_page()
 -> vm_normal_page()
 -> wp_page_shared()
     -> do_page_mkwrite()

I don't see anything in there that checked Uptodate.

> What I'm trying to understand is why this needs to be fixed inside
> ->page_mkwrite, because AFAICT if we have to fix this in the iomap
> code, we have the same "we got handed a bad page by the page fault"
> problem in every single ->page_mkwrite implementation....

I think the iomap code is the only filesystem which clears PageUptodate
on errors.  I know we've argued about whether that's appropriate or not
in the past.

> > I haven't dug through what generic/019 does, so I don't know how plausible
> > this is.
> 
> # Run fsstress and fio(dio/aio and mmap) and simulate disk failure
> # check filesystem consistency at the end.
> 
> I don't think it is mixing buffered writes and mmap writes on the
> same file via fio. Maybe fsstress is triggering that, but the
> fsstress workload is single threaded so, again, I'm not sure it will
> do this.

Maybe that's not how we end up with a read-only PTE in the process's
page tables.  Perhaps it starts out with a store, then on an fsync()
we mark it read-only, then try to do another store.



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux