On Wed, Apr 18, 2012 at 12:44:24AM +0100, Al Viro wrote:
> On Tue, Apr 17, 2012 at 03:08:26PM -0700, Linus Torvalds wrote:
> > > Or I could increment that counter for all the conflicting operations and
> > > rely on it instead of the i_mutex.  I was trying to avoid adding
> > > something like that (an inc, a dec, another error path) to every
> > > operation.  And hoping to avoid adding another field to struct inode.
> > > Oh well.
> >
> > We could just say that we can do a double inode lock, but then
> > standardize on the order. And the only sane order is comparing inode
> > pointers, not inode numbers like ext4 apparently does.
> >
> > With a standard order, I don't think it would be at all wrong to just
> > take the inode lock on rename.
>
> In principle, yes, but have you tried to grep for i_mutex?  Note that
> we have *another* place where multiple ->i_mutex might be held on
> non-directories (and unless I'm missing something, ext4 move_extent.c
> stuff doesn't play well with it): quota writes.  Which can, AFAICS,
> happen while write(2) is holding ->i_mutex on a regular file.  So
> it's not _that_ easy - we want something like "and quota file is goes
> last"

So the idea would be to always take the i_mutex on non-quota files
before taking it on quota files?

I tried pulling the ext4 thing into fs/inode.c, modifying the order to
do that, and then doing the rename change on top of that.

One thing I don't understand is how that interacts with
quota_on/quota_off.  How do we decide the right lock ordering if
files can go back and forth between being quota files?

--b.

> , since there we don't get to change the locking order - the first
> ->i_mutex is taken too far outside.
>
> I really don't like how messy i_mutex had become these days.  Right now
> I'm staring at 700-odd lines all over the place where it's taken/released
> and it's a wastebucket lock - used to protect random bits and scraps, with a
> lot of filesystems, etc.
> using it for purposes of their own ;-/
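
For illustration only, here is a userspace sketch of the ordering discussed above: non-quota files first, quota files last, with ties broken by comparing inode pointers. It is not the kernel code and not the patch mentioned in this thread. struct toy_inode, the is_quota_file flag, and the toy_* helpers are all hypothetical names, and the sketch assumes the flag never changes while locks are held, which is exactly the quota_on/quota_off question left open above.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct inode; not the kernel's definition. */
struct toy_inode {
    pthread_mutex_t i_mutex;
    bool is_quota_file; /* assumption: stable while any lock is held */
};

/* Returns true if a must be locked before b under the assumed ordering. */
static bool lock_before(const struct toy_inode *a, const struct toy_inode *b)
{
    if (a->is_quota_file != b->is_quota_file)
        return !a->is_quota_file;       /* non-quota first, quota last */
    return (uintptr_t)a < (uintptr_t)b; /* same class: order by address */
}

/* Lock two inodes in the assumed global order; handles a == b. */
static void toy_lock_two(struct toy_inode *a, struct toy_inode *b)
{
    assert(a != NULL && b != NULL);
    if (a == b) {
        pthread_mutex_lock(&a->i_mutex);
        return;
    }
    if (!lock_before(a, b)) {
        struct toy_inode *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(&a->i_mutex);
    pthread_mutex_lock(&b->i_mutex);
}

/* Unlock order does not matter for deadlock avoidance. */
static void toy_unlock_two(struct toy_inode *a, struct toy_inode *b)
{
    assert(a != NULL && b != NULL);
    pthread_mutex_unlock(&a->i_mutex);
    if (a != b)
        pthread_mutex_unlock(&b->i_mutex);
}
```

If a file could switch between quota and non-quota while another task holds its lock, two tasks could compute opposite orders for the same pair, so a rule like this would only hold if that transition were itself serialized somehow.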