Eric Biggers <ebiggers@xxxxxxxxxx> writes: > On Tue, Mar 14, 2023 at 10:15:11AM +0000, Luís Henriques wrote: >> Eric Biggers <ebiggers@xxxxxxxxxx> writes: >> >> > On Mon, Mar 13, 2023 at 12:33:09PM +0000, Luís Henriques wrote: >> >> + * The regular open path will use fscrypt_file_open for that, but in the >> >> + * atomic open a different approach is required. >> > >> > This should actually be fscrypt_prepare_lookup, not fscrypt_file_open, right? >> >> Ups, I missed this comment. >> >> I was comparing the regular open() with the atomic_open() paths. I think >> I really mean fscrypt_file_open() because that's where the encryption info >> is (or may be) set by calling fscrypt_require_key(). atomic_open needs >> something similar, but combined with a lookup. >> >> Maybe I can rephrase it to: >> >> The reason for getting the encryption info before checking if the >> directory has the encryption key is because the key may be available but >> the encryption info isn't yet set (maybe due to a drop_caches). The >> regular open path will call fscrypt_file_open which uses function >> fscrypt_require_key for setting the encryption info if needed. The >> atomic open needs to do something similar. >> > > No, regular open is two parts: ->lookup and ->open. fscrypt_prepare_lookup() > sets up the directory's key, whereas fscrypt_file_open() sets up the file's key. > > Your proposed fscrypt_prepare_atomic_open() sets up the directory's key. So it > is really fscrypt_prepare_lookup() that is its equivalent. Oh, I see what you mean now, and you're obviously correct. Thanks for the detailed explanation. > However, that raises the question of why doesn't ceph just use > fscrypt_prepare_lookup()? It seems the answer is that ceph wants to handle the > filenames encryption and no-key name encoding itself. And for that reason, its > ->lookup() does the following and does *not* use fscrypt_prepare_lookup(): > > if (IS_ENCRYPTED(dir)) { > err = ceph_fscrypt_prepare_readdir(dir); > if (err < 0) > return ERR_PTR(err); > if (!fscrypt_has_encryption_key(dir)) { > spin_lock(&dentry->d_lock); > dentry->d_flags |= DCACHE_NOKEY_NAME; > spin_unlock(&dentry->d_lock); > } > } Ugh, I tend to forget all the details behind these decisions. If I remember correctly, we had to work around the fact that the cephfs client handle directory data in a cumbersome way. We may not have the full data for a readdir, for example, and that has to be handled by a lookup. > So, actually I think this patch doesn't make sense. If ceph is doing the above > in its ->lookup() anyway, then it just should do the exact same thing in its > ->atomic_open() too. In fact, my initial fix for the cephfs bug was doing just that. It was a single patch to ceph_atomic_open() that would simply do: if (IS_ENCRYPTED(dir)) { set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags); err = __fscrypt_prepare_readdir(dir); if (!err && !fscrypt_has_encryption_key(dir)) { spin_lock(&dentry->d_lock); dentry->d_flags |= DCACHE_NOKEY_NAME; spin_unlock(&dentry->d_lock); } } What made me want to create a new helper was that I simply needed to call fscrypt_get_encryption_info() to force the encryption info to be set in the parent directory. But this function was only accessible through __fscrypt_prepare_readdir(), which isn't really a great function name for what I need here. Since __fscrypt_prepare_readdir() doesn't seem to be used anywhere else, maybe it could be removed and fscrypt_get_encryption_info() be exported instead? > If you want to add a new fscrypt_* helper function which *just* sets up the > given directory's key and sets the NOKEY_NAME flag on the given dentry > accordingly, that could make sense. However, it should be called from *both* > ->lookup() and ->atomic_open(), not just ->atomic_open(). > > It's also worth mentioning that setting up the filename separately from the > NOKEY_NAME flag makes ceph have the same race condition that I had fixed for the > other filesystems in commit b01531db6cec ("fscrypt: fix race where ->lookup() > marks plaintext dentry as ciphertext"). It's not a huge deal, but it can cause > some odd behavior, so it's worth thinking about whether it can be solved. Hmm... OK, looks like we'll need to have a look into this. Thanks for the heads-up. Cheers, -- Luís