Re: [RFC] The many faces of the I_LOCK

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, 22 February 2007 05:29:04 +1100, David Chinner wrote:
> On Wed, Feb 21, 2007 at 01:09:56PM +0000, Jörn Engel wrote:
> > 
> > Process A:				Process B:
> > inode_wait				[filesystem locking write path]
> > __wait_on_bit				__writeback_single_inode
> > out_of_line_wait_on_bit
> > ifind_fast
> > [filesystem calling iget()]
> > [filesystem locking write path]
> 
> This is caused by your cleaner thread racing with writeback doing inode
> lookup, right? You need a non-blocking inode lookup to prevent the deadlock,
> I guess....

Almost correct.  I need an inode lookup that is not blocking on
writes.  Blocking on reads is fine - it will simply delay the cleaner
and cost some performance.

> IIRC, the checks in xfs_ichgtime[_fast] are simply an optimisation - if the
> inode is currently I_LOCKed then we can't mark it dirty anyway so we don't
> even bother trying. We do mark the XFS inode structure (*) dirty, though, so
> the modification will make it to disk at some time in the future.
> 
> (*) XFS does double inode caching because the XFS transaction subsystem
> requires inodes to have a different lifecycle to the linux struct inode
> lifecycle.

Ok.  Then my current patches would break XFS, I believe.  I'll just add
a seperate fix for now and will merge that in later.

> > Now, if the group in fs/fs-writeback.c had a completion event that is
> > independent of anything in fs/inode.c, the deadlock scenario described
> > in section 1 goes away.  As a further result, ilookup5_nowait() can get
> > removed as its only user, NTFS, only introduced it as a workaround for
> > the deadlock scenario.
> 
> Could you use ilookup5_nowait() in LogFS and avoid the cleaner deadlock
> that way?

I could, but I personally consider ilookup5_nowait() to be a nasty hack.
It admittedly does what it was supposed to do, but it merely hacks
around the issue of using the same lock for two completely independent
purposes.

Also, I _want_ to block on I_LOCK in ilookup5() and friends.  If the
inode has been allocated, but not read from disk yet, the
wait_on_inode(inode) in ifind() is the only thing protecting me from
accessing garbage data.

Thinking about it, ignoring this lock can cause data loss on the
filesystem.  When validating blocks, they are compared to an unread
inode, which just looks like an empty file.  So all blocks look invalid
and will get deleted.  That race is _really_ hard to lose, but also
_really_ nasty.

Jörn

-- 
"[One] doesn't need to know [...] how to cause a headache in order
to take an aspirin."
-- Scott Culp, Manager of the Microsoft Security Response Center, 2001
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux