Re: [patch 6/6] mm: fsync livelock avoidance

On Thu, 11 Dec 2008 23:32:13 +0100
Nick Piggin <npiggin@xxxxxxx> wrote:

> On Thu, Dec 11, 2008 at 01:51:11PM -0800, Andrew Morton wrote:
> > On Wed, 10 Dec 2008 08:42:09 +0100
> > Nick Piggin <npiggin@xxxxxxx> wrote:
> > > 
> > > This lock also solves a real data integrity problem that I only noticed as
> > > I was writing the livelock avoidance code. If we consider the lock as the
> > > solution to this bug, this makes the livelock avoidance code much more
> > > attractive because then it does not introduce the new lock.
> > > 
> > > The bug is that fsync errors do not get propagated back up to the caller
> > > properly in some cases. Consider the case where we write a page in the
> > > writeout path, it then encounters an IO error and finishes writeback, and in
> > > the meantime another process (e.g. via sys_sync, or another fsync) clears the
> > > mapping error bits. Our fsync will then appear to have finished successfully,
> > > when it should actually have returned an error.
> > 
> > Has *anybody* *ever* complained about this behaviour?  I think maybe
> > one person after sixish years?
> 
> The livelock behaviour? (or the error propagation).
> 
> I first heard about it from Mikulas, where some dm tool locks up because
> it does direct IO on the block device of a mounted filesystem (or something
> like that).

Does it actually lock up?  Or does it just take a loooong time?

Presumably it can be worked around in userspace.

> That case is actually mostly solved by my first patch in the
> series. 

mm-direct-io-starvation-improvement.patch?   I guess that would help
a lot.  I can't imagine why we didn't do that years ago???

Can we please determine whether that optimisation was sufficient
for Mikulas's example?

> > Why fix it?
> 
> Good question. My earlier patches already in your tree removed some starvation
> avoidance code because they were breaking data integrity semantics. So in
> theory, your tree today is more susceptible to this sync/fsync starvation
> than mainline. I care most about the correctness, and it would be great if
> nobody cares about this starvation problem so we don't need the extra
> complexity.

Yes, it does add quite a bit of complexity and more code.  It'd be good
if we could find some way of avoiding merging it.
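
For reference, the lost-error interleaving described in the quoted text
above can be replayed in a few lines of userspace C. This is only a toy
model under stated assumptions, not kernel code: struct toy_mapping, the
toy_* helpers and the single boolean standing in for the mapping error
bits are all hypothetical names made up for illustration. It only shows
why a test-and-clear error flag shared between waiters lets one caller
consume an error that another caller needed to see.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a mapping's sticky error bit (not the kernel struct). */
struct toy_mapping {
	bool io_error;	/* set when writeback of a page fails */
};

/* Writeback completion records a failure on the shared mapping. */
static void toy_end_writeback(struct toy_mapping *m, bool failed)
{
	if (failed)
		m->io_error = true;
}

/* Each waiter reports the error once and clears it (test-and-clear). */
static int toy_wait_and_check(struct toy_mapping *m)
{
	if (m->io_error) {
		m->io_error = false;
		return -EIO;
	}
	return 0;
}

struct toy_result {
	int sync_ret;	/* what the unrelated sync saw */
	int fsync_ret;	/* what our fsync saw */
};

/* Replay: fsync's write fails, a sync clears the bit, then fsync checks. */
static struct toy_result toy_replay_race(void)
{
	struct toy_mapping m = { .io_error = false };
	struct toy_result r;

	toy_end_writeback(&m, true);		/* fsync's page hits an IO error */
	r.sync_ret = toy_wait_and_check(&m);	/* another process consumes it */
	r.fsync_ret = toy_wait_and_check(&m);	/* fsync finds a clean mapping */

	assert(!m.io_error);			/* nothing is left to report */
	return r;
}
```

Calling toy_replay_race() in this model gives sync_ret == -EIO and
fsync_ret == 0, i.e. the fsync whose write actually failed is the one
that returns success.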
