Ian Kent wrote: > Sage Weil wrote: >> On Tue, 24 Mar 2009, Ian Kent wrote: >>> Ian Kent wrote: >>>> Sage Weil wrote: >>>>> real_lookup() is called by do_lookup() if dentry revalidation fails. If >>>>> the cache is re-populated while waiting for i_mutex, it may find that >>>>> a d_lookup() subsequently succeeds (see the "Uhhuh! Nasty case" comment). >>>>> >>>>> Previously, real_lookup() would drop i_mutex and do_revalidate() again. If >>>>> revalidate failed _again_, however, it would give up with -ENOENT. The >>>>> problem here that network file systems may be invalidating dentries via >>>>> server callbacks, e.g. due to concurrent access from another client, and >>>>> -ENOENT is frequently the wrong answer. >>>> This will be something of a problem for autofs4 (and autofs). >>>> It would require fairly significant changes to the revalidate code. >>> Or maybe not that significant, I'm not sure yet. >> Can you be more specific about your concern? Since the locking rules >> aren't really changing (see below), the main thing to worry about is the >> loss of the wonky -ENOENT error.. > > Historically, the lock inconsistency has been around for a long time and > autofs(4) has worked around that issue. The situation is that when > autofs(4) needs to send a request back to user space it must be sent > without the mutex held. This callback can be needed in the lookup or in > revalidate, depending on the situation. Currently when the callback is > needed in revalidate, the lock is never held. > > If the lock is now held through revalidate then I will need to move the > lock release/re-acquire from the lookup to the revalidate. My initial > concern was that if we need to release and re-aquire the lock more than > once in the revalidate code the complexity of potential deadlock goes up > considerably. Having put so much effort into dealing with deadlock > issues in lookup with just one release and re-acquire I'm worried I'll > end up with more problems. > > However, it looks like I will only need to do the release/re-acquire > once in revalidate, so maybe the status-quo will be maintained and no > additional issues will surface. > It's just occurred to me that your proposing to hold the mutex over one of the two calls that can occur as a result of do_lookup(). That's not good because then we possibly have mixed locking for the same VFS operation. Last time I tried to deal with this I got into a "am I the lock owner, can I release it" dilemma. So, I guess I'm saying that, to do this we would need to hold the lock for both the calls to revalidate. But if revalidate returns an error code then it will be returned and so __link_path_walk() will then return that. At least that is what I intended when I did it, but maybe I didn't get it quite right for everyone, for example there would be no call to d_invalidate in that case. Can you perhaps use that mechanism? Ian -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html