Re: [PATCH 1/2] vfs: make real_lookup do dentry revalidation with i_mutex held

Ian Kent <raven@xxxxxxxxxx> · Wed, 29 Jul 2009 10:59:33 +0800

Sage Weil wrote:
> On Mon, 20 Jul 2009, Ian Kent wrote:
>> Sage Weil wrote:
>>> On Wed, 24 Jun 2009, Ian Kent wrote:
>>>> I'm continuing with this now, but there's a deadlock in there somewhere!
>>> Sorry, are you still working with the patch you posted a few months back?
>>>
>>> 	http://marc.info/?l=linux-fsdevel&m=123831685111213&w=2
>>>
>>> Looking over it, the 
>>>
>>> +		unsigned int lock_held = mutex_is_locked(&dir->i_mutex);
>>> ...
>>> +		if (lock_held) {
>>> +			/* Already pending, send to ->lookup() */
>>> +			d_drop(dentry);
>>>
>>> bit looks highly suspect.  I'm guessing revalidate should never sleep, and 
>>> always kick things off to ->lookup() (to do any waiting on upcall 
>>> completion or whatever else) if the dentry isn't valid now...?
>> I tried your suggestion and have finally come to the conclusion that it
>> cannot work. My own fault really, for not fully understanding why I used
>> the above approach in the first place.
>>
>> I believe that if the mutex is not held then I "must" handle it in the
>> revalidate routine and if the mutex is held I "must" defer to
>> ->lookup(). The only way to send this to ->lookup() is to drop the
>> dentry and rehash it in lookup and the mutex must be held over both
>> calls or it is possible for an execution path to skip over the lookup
>> call when several concurrent processes walk into the same dentry at the
>> same time. AFAICS it isn't possible to detect this and work around it
>> when sending everything to ->lookup()
> 
> How is it possible for a revalidate to fail and then to skip over the 
> ->lookup() call?  That sounds like a problem with the VFS?  (Like the one 
> I'm trying to fix... you're testing with the real_lookup patch applied, I 
> hope?)

It's been hard to work out exactly what's happening but I suspect it is
due to multiple walks occurring while the the dentry hashed state
changes. Like, one process hits revalidate, drops the dentry, another
process comes along and goes straight to lookup and rehashes the dentry,
the original process, that dropped the dentry, then ends up not calling
lookup. The race only happens occasionally, and my test uses 10
processes that tend to hit the same dentrys at the same time at various
points in the mount tree over 150 iterations. Admittedly, it's a fairly
stressful test as autofs goes but it does tend to show up races quite well.

I'm now working on a method to ensure the code path that unhashed the
dentry is the one that rehashes it but that has its own set of
difficulties, partly because walks from the daemon shouldn't block and
the dentry can become unhashed at any time after other code paths have
dropped and rehashed the dentry, since the mutex is not always held
during revalidate.

> 
>> This digression was quite costly in time for me but useful in improving
>> my understanding of the problem. I'm going to return to my original
>> approach, hopefully I will make better progress.
> 
> I'm sorry if I've sent you on a wild goose chase.  If you describe how 
> you're reproducing the problem I can try it locally.

I thought that at the time but it turns out that you didn't, the race is
still present either way. I've since gone back to this approach.

I always knew this was going to be difficult but I didn't expect it to
not be possible, given that I'm using your VFS locking change. If this
continues to be a problem I may need to start thinking about some small
VFS changes to help work around the difficulty.

Ian

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html