Re: races in ll_splice_alias() and elsewhere (ext4, ocfs2)

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mar 9, 2016, at 10:46 PM, Oleg Drokin wrote:

> 
> On Mar 9, 2016, at 10:34 PM, Al Viro wrote:
> 
>> On Thu, Mar 10, 2016 at 03:08:57AM +0000, Drokin, Oleg wrote:
>>>> Note, BTW, that d_splice_alias() will not look for aliases in case of
>>>> non-directories - for those it's the same d_add(), since there we can
>>>> legitimately have many dentry aliases over the same inode.  For directories
>>>> we *can't*.
>>> 
>>> Ah! Btw this highlights the missed case for d_exact_alias + d_splice_alias
>>> in Lustre with current collection of patches you carry.
>>> 
>>> Suppose we have a dentry pointing to an inode all nicely covered by a lustre
>>> lock (to ensure it is valid).
>>> Now some other client does something that invalidates the name (rename,
>>> or just open(O_CREAT) even), this causes our local lock to disappear
>>> and causes the dentry to be declared "lustre invalid" via
>>> ll_md_blocking_ast()->ll_invalidate_aliases()->d_lustre_invalidate()
>>> 
>>> This sets the "invalid" flag in __d_lustre_invalidate() and
>>> also would try to unhash the dentry in some cases:
>>>       if (d_count(dentry) == 0 && !(dentry->d_flags & DCACHE_DISCONNECTED))
>>>               __d_drop(dentry);
>>> 
>>> Now if none of those conditions hit, the dentry stays hashed.
>>> But the problem is if it's not a directory, then d_splice_alias would
>>> also ignore it, while d_exact_alias would ignore it due to it being still hashed
>>> and this dentry would just hang around uselessly taking RAM.
>>> 
>>> Is there an easy way to rectify this, I wonder?
>> 
>> Wait a minute.  If it's hashed, has the right name and the right parent,
>> why the hell are we calling ->lookup() on a new dentry in the first place?
>> Why hadn't we simply picked it from dcache?
> 
> This is because of the trickery we do in the d_compare.
> our d_compare looks at the "invalid" flag and if it's set, returns "not matching",
> triggering the lookup instead of revalidate.
> This makes revalidate simple and fast.
> (We used to have a complicated revalidate with a lot of code duplication with
> lookup in order to be able to query the server and pass all sorts of data there
> and it was nothing but trouble)

Let me add that all that complicated logic was due to d_revalidate returning 0 in the
past was bound to quickly turn into ESTALE returned to userspace which was not really
acceptable.

It seems this has changed dramatically and we might not need this now if ESTALE
is never returned just because dentry happened to be invalid.
Time to run some tests, I guess.

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [Samba]     [Device Mapper]     [CEPH Development]
  Powered by Linux