Re: [PATCH v16 25/68] ceph: make d_revalidate call fscrypt revalidator for encrypted dentries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On 09/03/2023 17:55, Luís Henriques wrote:
Xiubo Li <xiubli@xxxxxxxxxx> writes:

On 09/03/2023 01:14, Luís Henriques wrote:
Xiubo Li <xiubli@xxxxxxxxxx> writes:

On 08/03/2023 17:29, Luís Henriques wrote:
Xiubo Li <xiubli@xxxxxxxxxx> writes:

On 08/03/2023 02:53, Luís Henriques wrote:
xiubli@xxxxxxxxxx writes:

From: Jeff Layton <jlayton@xxxxxxxxxx>

If we have a dentry which represents a no-key name, then we need to test
whether the parent directory's encryption key has since been added.  Do
that before we test anything else about the dentry.

Reviewed-by: Xiubo Li <xiubli@xxxxxxxxxx>
Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
---
     fs/ceph/dir.c | 8 ++++++--
     1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index d3c2853bb0f1..5ead9f59e693 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -1770,6 +1770,10 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags)
     	struct inode *dir, *inode;
     	struct ceph_mds_client *mdsc;
     +	valid = fscrypt_d_revalidate(dentry, flags);
+	if (valid <= 0)
+		return valid;
+
This patch has confused me in the past, and today I found myself
scratching my head again looking at it.

So, I've started seeing generic/123 test failing when running it with
test_dummy_encryption.  I was almost sure that this test used to run fine
before, but I couldn't find any evidence (somehow I lost my old testing
logs...).

Anyway, the test is quite simple:

1. Creates a directory with write permissions for root only
2. Writes into a file in that directory
3. Uses 'su' to try to modify that file as a different user, and
       gets -EPERM

All these steps run fine, and the test should pass.  *However*, in the
test cleanup function, a simple 'rm -rf <dir>' will fail with -ENOTEMPTY.
'strace' shows that calling unlinkat() to remove the file got a '-ENOENT'
and then -ENOTEMPTY for the directory.

Some digging allowed me to figure out that running commands with 'su' will
drop caches (I see 'su (874): drop_caches: 2' in the log).  And this is
how I ended up looking at this patch.  fscrypt_d_revalidate() will return
'0' if the parent directory does has a key (fscrypt_has_encryption_key()).
Can we really say here that the dentry is *not* valid in that case?  Or
should that '<= 0' be a '< 0'?

(But again, this patch has confused me before...)
Luis,

Could you reproduce it with the latest testing branch ?
Yes, I'm seeing this with the latest code.
Okay. That's odd.

BTW, are you using the non-root user to run the test ?

Locally I am using the root user and still couldn't reproduce it.
Yes, I'm running the tests as root but I've also 'fsgqa' user in the
system (which is used by this test.  Anyway, for reference, here's what
I'm using in my fstests configuration:

TEST_FS_MOUNT_OPTS="-o name=admin,secret=<key>,copyfrom,ms_mode=crc,test_dummy_encryption"
MOUNT_OPTIONS="-o name=admin,secret=<key>,copyfrom,ms_mode=crc,test_dummy_encryption"

I never seen the generic/123 failure yet. And just now I ran the test for many
times locally it worked fine.
That's odd.  With 'test_dummy_encryption' mount option I can reproduce it
every time.

   From the generic/123 test code it will never touch the key while testing, that
means the dentries under the test dir will always have the keyed name. And then
the 'fscrypt_d_revalidate()' should return 1 always.

Only when we remove the key will it trigger evicting the inodes and then when we
add the key back will the 'fscrypt_d_revalidate()' return 0 by checking the
'fscrypt_has_encryption_key()'.

As I remembered we have one or more fixes about this those days, not sure
whether you were hitting those bugs we have already fixed ?
Yeah, I remember now, and I guess there's yet another one here!

I'll look closer into this and see if I can find out something else.  I'm
definitely seeing 'fscrypt_d_revalidate()' returning 0, so probably the
bug is in the error paths, when the 'fsgqa' user tries to write into the
file.
Please add some debug logs in the code.
I *think* I've something.  The problem seems to be that, after the
drop_caches, the test directory is evicted and ceph_evict_inode() will
call fscrypt_put_encryption_info().  This last function will clear the
inode fscrypt info.  Later on, when the test tries to write to the file
with:

    _user_do "echo goo >> $my_test_subdir/data_coherency.txt"

function ceph_atomic_open() will correctly identify that '$my_test_subdir'
is encrypted, but the key isn't set because the inode was evicted.  This
means that fscrypt_has_encryption_key() will return '0' and DCACHE_NOKEY_NAME
will be *incorrectly* added to the 'data_coherency.txt' dentry flags.

Later on, ceph_d_revalidate() will see the problem I initially described.

The (RFC) patch bellow seems to fix the issue.  Basically, it will force
the fscrypt info to be set in the directory by calling __fscrypt_prepare_readdir()
and the fscrypt_has_encryption_key() will then return 'true'.
Interesting.

It's worth to add one separated commit to fix this.

Luis, could you send one patch to the mail list ? And please add the detail
comments in the code to explain it.

This will help us to under stand the code and to debug potential similar bugs in
future.
Sure, I'll do that.  In fact, I'll probably send out two patches as Jeff
also suggested a better name for __fscrypt_prepare_readdir().  Not sure
Eric will take it, but it's worth trying.
Sure. Thanks.
Cheers,




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux