Re: [PATCH 1/2] vfs: make real_lookup do dentry revalidation with i_mutex held

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Sage Weil wrote:
> On Thu, 30 Jul 2009, Ian Kent wrote:
>> Sage Weil wrote:
>>> On Wed, 29 Jul 2009, Ian Kent wrote:
>>>>> How is it possible for a revalidate to fail and then to skip over the 
>>>>> ->lookup() call?  That sounds like a problem with the VFS?  (Like the one 
>>>>> I'm trying to fix... you're testing with the real_lookup patch applied, I 
>>>>> hope?)
>>>> It's been hard to work out exactly what's happening but I suspect it is
>>>> due to multiple walks occurring while the the dentry hashed state
>>>> changes. Like, one process hits revalidate, drops the dentry, another
>>>> process comes along and goes straight to lookup and rehashes the dentry,
>>>> the original process, that dropped the dentry, then ends up not calling
>>>> lookup. The race only happens occasionally, and my test uses 10
>>>> processes that tend to hit the same dentrys at the same time at various
>>>> points in the mount tree over 150 iterations. Admittedly, it's a fairly
>>>> stressful test as autofs goes but it does tend to show up races quite well.
>>> If I'm reading namei.c correctly, every ->d_revalidate() that returns NULL 
>>> will always result in a ->lookup() (unless IS_DEADDIR(inode)), without any 
>>> regard for the dentry's hashed/unhashed status.  The exception appears to 
>>> be a call in __link_path_walk when FS_REVAL_DOT is set in the superblock 
>>> (nfs only):
>>>
>>> 		if (nd->path.dentry && nd->path.dentry->d_sb &&
>>> 		    (nd->path.dentry->d_sb->s_type->fs_flags & FS_REVAL_DOT)) {
>>> 			err = -ESTALE;
>>> 			/* Note: we do not d_invalidate() */
>>> 			if (!nd->path.dentry->d_op->d_revalidate(
>>> 					nd->path.dentry, nd))
>>> 				break;
>>>
>>> in which case userspace would see an ESTALE.
>>>
>>> It sounds like a few well placed printks that include the pid for unhash, 
>>> rehash, revalidate, and lookup would narrow down what is going on?
>> Of course I've been doing that.
>>
>> I think you need to re-read what I wrote above. Your analysis is fine if
>> there is only one process concurrently walking onto the dentry. If a
>> second process beats the revalidate process to real_lookup() and
>> incorrectly rehashes the dentry the following revalidate process will
>> not call ->lookup().
> 
> Sorry, you're right.  I missed the d_lookup call in real_lookup.  So you 
> mean something like:
> 
> do_lookup
> 	__d_lookup -> finds dentry
> 	do_revalidate -> drops dentry, returns null
> 	real_lookup
> 		take i_mutex
> 		d_lookup -> returns dentry rehashed by racing ->lookup()

Well, that's what I thought but looking again it doesn't look like that
can happen. However, the evidence I have speaks for itself, a user space
process is calling ->open() on a mount point with no mount. That should
never happen, it should have waited for the mount. And the code to
ensure that the process that dropped the dentry is the one that
re-hashes it makes the problem go away.

> 
>> My most recent try at this seems to be working.
>> I need to clean up the printks and alter one of the comments and re-test
>> with a full test (I have been working with only half the test enabled as
>> it was most problematic). The testing takes a long time to run, 1 hour
>> 10 minutes for each of two configurations and the criteria for success
>> in at least six consecutive successful runs, that's about thirteen
>> hours. But, hopefully, when that is done I'll have a series to post for
>> comment.
> 
> Great to hear.  I hope my inane blathering hasn't been totally useless! :)

Sadly, on the seventh run of my test I encountered what looks like a
deadlock. I'm working on that now but, but as I progress the testing
takes longer and longer, *sigh*.

Ian
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux