Re: [Lsf-pc] [LSF/MM TOPIC] I/O error handling and fsync()

On Tue, 2017-01-24 at 11:16 +1100, NeilBrown wrote:
> On Mon, Jan 23 2017, Trond Myklebust wrote:
> 
> > On Mon, 2017-01-23 at 17:35 -0500, Jeff Layton wrote:
> > > On Mon, 2017-01-23 at 11:09 +0100, Kevin Wolf wrote:
> > > > 
> > > > However, if we look at the greater problem of hanging requests
> > > > that came up in the more recent emails of this thread, it is only
> > > > moved rather than solved. Chances are that already write() would
> > > > hang now instead of only fsync(), but we still have a hard time
> > > > dealing with this.
> > > > 
> > > 
> > > Well, it _is_ better with O_DIRECT as you can usually at least
> > > break out of the I/O with SIGKILL.
> > > 
> > > When I last looked at this, the problem with buffered I/O was that
> > > you often end up waiting on page bits to clear (usually
> > > PG_writeback or PG_dirty), in non-killable sleeps for the most part.
> > > 
> > > Maybe the fix here is as simple as changing that?
> > 
> > At the risk of kicking off another O_PONIES discussion: Add an
> > open(O_TIMEOUT) flag that would let the kernel know that the
> > application is prepared to handle timeouts from operations such as
> > read(), write() and fsync(), then add an ioctl() or syscall to allow
> > said application to set the timeout value.
> 
> I was thinking on very similar lines, though I'd use 'fcntl()' if
> possible because it would be a per-"file description" option.
> This would be a function of the page cache, and a filesystem wouldn't
> need to know about it at all.  Once enabled, 'read', 'write', or 'fsync'
> would return EWOULDBLOCK rather than waiting indefinitely.
> It might be nice if 'select' could then be used on page-cache file
> descriptors, but I think that is much harder.  Supporting O_TIMEOUT would
> be a practical first step - if someone agreed to actually try to use it.
> 

Yeah, that does seem like it might be worth exploring. 
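
For illustration only, here is a rough userspace approximation of what an
application opting in to such timeouts might see. Nothing like O_TIMEOUT
exists today, so fsync_timeout() and struct fsync_job below are made-up
names: the sketch hands fsync() to a helper thread and returns
-EWOULDBLOCK if it has not finished by the deadline. It only shows the
return convention. The helper thread can still end up stuck in the
kernel, which is the part a page-cache-level timeout would have to solve.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical bookkeeping shared by the caller and the helper thread. */
struct fsync_job {
	pthread_mutex_t lock;
	pthread_cond_t done_cv;
	int fd;
	int done;      /* helper has returned from fsync() */
	int abandoned; /* caller timed out and stopped waiting */
	int result;    /* 0 or -errno from fsync() */
};

static void fsync_job_free(struct fsync_job *job)
{
	pthread_cond_destroy(&job->done_cv);
	pthread_mutex_destroy(&job->lock);
	free(job);
}

static void *fsync_worker(void *arg)
{
	struct fsync_job *job = arg;
	int ret = fsync(job->fd) == 0 ? 0 : -errno;
	int abandoned;

	pthread_mutex_lock(&job->lock);
	job->result = ret;
	job->done = 1;
	abandoned = job->abandoned;
	pthread_cond_signal(&job->done_cv);
	pthread_mutex_unlock(&job->lock);

	if (abandoned)
		fsync_job_free(job); /* the caller gave up, so we clean up */
	return NULL;
}

/*
 * Hypothetical helper, not an existing API. Returns 0 on success, -errno
 * from fsync(), or -EWOULDBLOCK if fsync() is still running after
 * timeout_ms. The caller must keep fd open until the helper finishes.
 */
int fsync_timeout(int fd, int timeout_ms)
{
	struct fsync_job *job;
	struct timespec deadline;
	pthread_t tid;
	int ret, err;

	assert(timeout_ms >= 0);
	job = calloc(1, sizeof(*job));
	if (!job)
		return -ENOMEM;
	pthread_mutex_init(&job->lock, NULL);
	pthread_cond_init(&job->done_cv, NULL);
	job->fd = fd;

	err = pthread_create(&tid, NULL, fsync_worker, job);
	if (err) {
		fsync_job_free(job);
		return -err;
	}
	pthread_detach(tid);

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&job->lock);
	err = 0;
	while (!job->done && err != ETIMEDOUT)
		err = pthread_cond_timedwait(&job->done_cv, &job->lock, &deadline);
	if (job->done) {
		ret = job->result;
		pthread_mutex_unlock(&job->lock);
		fsync_job_free(job);
	} else {
		job->abandoned = 1; /* helper frees the job when fsync() returns */
		ret = -EWOULDBLOCK;
		pthread_mutex_unlock(&job->lock);
	}
	return ret;
}
```

An application would call fsync_timeout(fd, 5000) and treat -EWOULDBLOCK
as "still in flight" rather than as a write failure.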

That said, I think there's something even simpler we can do to make
things better for a lot of cases, and it may even help pave the way for
the proposal above.

Looking closer and remembering more, I think the main problem area when
the pages are stuck in writeback is the wait_on_page_writeback call in
places like wait_for_stable_page and __filemap_fdatawait_range.

That call sleeps uninterruptibly (TASK_UNINTERRUPTIBLE), and it's common to
see applications stuck there in these situations. They're unkillable too,
so your only recourse is to hard reset the box when you can't reestablish
connectivity.

I think it might be good to consider making some of those sleeps
TASK_KILLABLE. For instance, both of the callers named above return int,
so it may be possible to return ERESTARTSYS when the task receives a
fatal signal.
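
To make the idea concrete, here is a toy userspace model of the control
flow (not kernel code). struct mock_page, struct mock_task, the wakeup
hooks and both mock_* functions are invented for illustration; a real
patch would presumably build on the kernel's existing TASK_KILLABLE wait
helpers instead. The point is only that the waiter bails out with an
error once a fatal signal is pending, and an int-returning caller passes
that error up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Kernel-internal value of ERESTARTSYS; it is not in userspace <errno.h>. */
#define ERESTARTSYS 512

/* Mock stand-ins for struct page and the current task (hypothetical). */
struct mock_page { bool writeback; };
struct mock_task { bool fatal_signal_pending; };

/* Models whatever happens between two wakeups of the sleeping task. */
typedef void (*wakeup_fn)(struct mock_page *page, struct mock_task *task,
			  int tick);

/* Event hook: the I/O completes on the third wakeup. */
void io_completes(struct mock_page *page, struct mock_task *task, int tick)
{
	(void)task;
	if (tick >= 2)
		page->writeback = false;
}

/* Event hook: the I/O never completes, but SIGKILL arrives on the third wakeup. */
void sigkill_arrives(struct mock_page *page, struct mock_task *task, int tick)
{
	(void)page;
	if (tick >= 2)
		task->fatal_signal_pending = true;
}

/* Killable variant of the wait: gives up once a fatal signal is pending. */
int mock_wait_on_page_writeback_killable(struct mock_page *page,
					 struct mock_task *task,
					 wakeup_fn on_wakeup)
{
	int tick = 0;

	assert(page && task && on_wakeup);
	while (page->writeback) {
		if (task->fatal_signal_pending)
			return -ERESTARTSYS;
		on_wakeup(page, task, tick++);
	}
	return 0;
}

/* Caller modelled on a range wait: stops at the first interrupted page. */
int mock_fdatawait_range(struct mock_page *pages, size_t nr_pages,
			 struct mock_task *task, wakeup_fn on_wakeup)
{
	for (size_t i = 0; i < nr_pages; i++) {
		int ret = mock_wait_on_page_writeback_killable(&pages[i], task,
							       on_wakeup);
		if (ret)
			return ret;
	}
	return 0;
}
```

With io_completes the range wait returns 0 once every page has left
writeback; with sigkill_arrives it returns -ERESTARTSYS and the pages are
still marked as under writeback, which is the case where the task can now
exit instead of hanging.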

-- 
Jeff Layton <jlayton@xxxxxxxxxxxxxxx>