From: Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx> Introduce a dentry revalidation helper to check the negative dentries of case-insensitive filesystems. This helper is based on the fact that a negative dentry might safe to be reused on a casefolded directory if it was created during a case-insensitive lookup, because that kind of lookup verifies not only the exact name doesn't exist in a directory, but also that *any* case-equivalent name also doesn't exist. The sole exception is during file creation, in which case we also need to make sure the name matches case-sensitively, in order to assure the disk name-preserving semantics. We cover most creations by checking LOOKUP_CREATE|LOOKUP_RENAME_TARGET flags. But, while most creations use those flags, there are filesystem helpers that call lookup for creation with flags==0. Since we can't know whether those are for creation, just reject the negative dentries if there are no flags to check. Note that we avoid taking the ->d_lock while accessing ->d_name, because it isn't really necessary for the LOOKUP_CREATE/LOOKUP_RENAME_TARGET case. That is because in every creation path with these flags, we know the parent inode lock is acquired, at least for reading, thus stabilizing the d_name, since it prevents the dentry from being instantiated and negative dentries cannot be moved. See also the comment in the code. * Discussion on the ->d_name stability d_revalidate can only be reached from 4 code paths: lookup_dcache, __lookup_slow, lookup_open and lookup_fast: - lookup_dcache only reaches d_revalidate with creation flags when coming from __lookup_hash, which needs the parent locked already. - In __lookup_slow, either the parent inode is read-locked by the caller (lookup_slow), or it is called with no flags (lookup_one*). A read lock suffices to prevent concurrent ->d_name modifications, with the exception of a modification inside __d_unalias, which is not a problem because negative dentries are not allowed to be moved with __d_move. In addition, d_instantiate shouldn't race with this case because its callers also acquire the parent inode lock, preventing it from racing with lookup creation. - lookup_open also requires the parent to be locked in the creation case, which is done in open_last_lookups. - lookup_fast will indeed be called with the parent unlocked, but it shouldn never be called with LOOKUP_CREATE. Either it is called in the link_path_walk, where nd->flags doesn't have LOOKUP_CREATE yet or in open_last_lookups. But, in this case, it also never has LOOKUP_CREATE, because it is only called on the !O_CREAT case, which means op->intent doesn't have LOOKUP_CREAT (set in build_open_flags only if O_CREAT is set). In addition, for the LOOKUP_RENAME_TARGET, we are doing a rename, so the parents inodes are also locked. Reviewed-by: Theodore Ts'o <tytso@xxxxxxx> Signed-off-by: Gabriel Krisman Bertazi <krisman@xxxxxxxxxxxxx> --- Changes since v5: - Use IS_CASEFOLDED directly (Eric) - Reword commit message and comment in the code (Eric) Changes since v4: - Drop useless inline declaration (eric) - Refactor to drop extra identation (Christian) - Discuss d_instantiate Changes since v3: - Add comment regarding creation (Eric) - Reorder checks to clarify !flags meaning (Eric) - Add commit message explanaton of the inode read lock wrt. __d_move. (Eric) Changes since v2: - Add comments to all rejection cases (Eric) - safeguard against filesystem creating dentries without LOOKUP flags --- fs/libfs.c | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) diff --git a/fs/libfs.c b/fs/libfs.c index 5b851315eeed..26bf1b832b0a 100644 --- a/fs/libfs.c +++ b/fs/libfs.c @@ -1462,9 +1462,63 @@ static int generic_ci_d_hash(const struct dentry *dentry, struct qstr *str) return 0; } +static int generic_ci_d_revalidate(struct dentry *dentry, + const struct qstr *name, + unsigned int flags) +{ + const struct dentry *parent; + const struct inode *dir; + + if (!d_is_negative(dentry)) + return 1; + + parent = READ_ONCE(dentry->d_parent); + dir = READ_ONCE(parent->d_inode); + + if (!dir || !IS_CASEFOLDED(dir)) + return 1; + + /* + * Negative dentries created prior to turning the directory + * case-insensitive cannot be trusted, since they don't ensure + * any possible case version of the filename doesn't exist. + */ + if (!d_is_casefolded_name(dentry)) + return 0; + + /* + * If the lookup is for creation, then a negative dentry can only be + * reused if it's a case-sensitive match, not just a case-insensitive + * one. This is needed to make the new file be created with the name + * the user specified, preserving case. + * + * LOOKUP_CREATE or LOOKUP_RENAME_TARGET cover most creations. In these + * cases, ->d_name is stable and can be compared to 'name' without + * taking ->d_lock because the caller must hold dir->i_rwsem. (This + * is because the directory lock blocks the dentry from being + * concurrently instantiated, and negative dentries are never moved.) + * + * All other creations actually use flags==0. These come from the edge + * case of filesystems calling functions like lookup_one() that do a + * lookup without setting the lookup flags at all. Such lookups might + * or might not be for creation, and if not don't guarantee stable + * ->d_name. Therefore, invalidate all negative dentries when flags==0. + */ + if (flags & (LOOKUP_CREATE | LOOKUP_RENAME_TARGET)) { + if (dentry->d_name.len != name->len || + memcmp(dentry->d_name.name, name->name, name->len)) + return 0; + } else if (!flags) { + return 0; + } + + return 1; +} + static const struct dentry_operations generic_ci_dentry_ops = { .d_hash = generic_ci_d_hash, .d_compare = generic_ci_d_compare, + .d_revalidate = generic_ci_d_revalidate, }; #endif -- 2.41.0