On Mon, Aug 10, 2015 at 04:59:42PM +0200, Jan Kara wrote: > On Fri 07-08-15 21:55:52, Oleg Nesterov wrote: > > I'll try to re-check/re-test, but so far I think that this and the > > previous series are correct. However 3/4 from the previous series > > (attached at the end just in case) uncovers (I think) some problems > > in xfs locking. .... > > Hum, I had a look at the lockdep report below and it's one of the > peculiarities of the freeze protection. For record let me repeat the full > argument for why I don't think there's a possibility for a real deadlock. > Feel free to skip to the end if you get bored. > > One would like to construct the lock chain as: > > CPU0 (chown foo dir) CPU1 (readdir dir) CPU2 (page fault) > process Y process X, thread 0 process X, thread 1 > > get ILOCK for dir > gets freeze protection > starts transaction in xfs_setattr_nonsize > waits to get ILOCK on 'dir' > get mmap_sem for X > wait for mmap_sem for process X > in filldir() > wait for freeze protection in > xfs_page_mkwrite > > and CPU3 then being in freeze_super() blocking CPU2 and waiting for CPU0 to > finish it's freeze-protected section. But this cannot happen. The reason is > that we block writers level-by-level and thus while there are writers at > level X, we do not block writers at level X+1. So in this particular case > freeze_super() will block waiting for CPU0 to finish its freeze protected > section while CPU2 is free to continue. > > In general we have a chain like > > freeze L0 -> freeze L1 -> freeze L2 -> ILOCK -> mmap_sem --\ > A | > \------------------------------------------/ > > But since ILOCK is always acquired with freeze protection at L0 and we can > block at L1 only after there are no writers at L0, this loop can never > happen. > > Note that if we use the property of freezing that lock at level X+1 cannot > block when we hold lock at level X, we can as well simplify the dependency > graph and track in it only the lowest level of freeze lock that is > currently acquired (since the levels above it cannot block and do not in > any way influence blocking of other processes either and thus are > irrelevant for the purpose of deadlock detection). Then the dependency > graph we'd get would be: > > freeze L0 -> ILOCK -> mmap_sem -> freeze L1 > > and we have a nice acyclic graph we like to see... So probably we have to > hack the lockdep instrumentation some more and just don't tell lockdep > about freeze locks at higher levels if we already hold a lock at lower > level. Thoughts? The XFS directory ilock->filldir->might_fault locking path has been generating false positives in quite a lot of places because of things we do on one side of the mmap_sem in filesystem paths vs thigs we do on the other side of the mmap_sem in the page fault path. I'm getting sick of these stupid lockdep false positives. I think I need to rework the XFS readdir locking patch I wrote a while back: http://oss.sgi.com/archives/xfs/2014-03/msg00146.html At the time it wasn't clear what the best way forward was. Right now I think the method I originally used (IOLOCK for directory data and hence readdir, ILOCK for everything else) will be sufficient; if we need to do anything smarter that can be addressed further down the road. Cheers, Dave. -- Dave Chinner david@xxxxxxxxxxxxx -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html