Re: [PATCH v16 25/68] ceph: make d_revalidate call fscrypt revalidator for encrypted dentries

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, 2023-03-08 at 18:30 +0000, Luís Henriques wrote:
> Jeff Layton <jlayton@xxxxxxxxxx> writes:
> 
> > On Wed, 2023-03-08 at 17:14 +0000, Luís Henriques wrote:
> > > Xiubo Li <xiubli@xxxxxxxxxx> writes:
> > > 
> > > > On 08/03/2023 17:29, Luís Henriques wrote:
> > > > > Xiubo Li <xiubli@xxxxxxxxxx> writes:
> > > > > 
> > > > > > On 08/03/2023 02:53, Luís Henriques wrote:
> > > > > > > xiubli@xxxxxxxxxx writes:
> > > > > > > 
> > > > > > > > From: Jeff Layton <jlayton@xxxxxxxxxx>
> > > > > > > > 
> > > > > > > > If we have a dentry which represents a no-key name, then we need to test
> > > > > > > > whether the parent directory's encryption key has since been added.  Do
> > > > > > > > that before we test anything else about the dentry.
> > > > > > > > 
> > > > > > > > Reviewed-by: Xiubo Li <xiubli@xxxxxxxxxx>
> > > > > > > > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > > > > > > > ---
> > > > > > > >     fs/ceph/dir.c | 8 ++++++--
> > > > > > > >     1 file changed, 6 insertions(+), 2 deletions(-)
> > > > > > > > 
> > > > > > > > diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
> > > > > > > > index d3c2853bb0f1..5ead9f59e693 100644
> > > > > > > > --- a/fs/ceph/dir.c
> > > > > > > > +++ b/fs/ceph/dir.c
> > > > > > > > @@ -1770,6 +1770,10 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags)
> > > > > > > >     	struct inode *dir, *inode;
> > > > > > > >     	struct ceph_mds_client *mdsc;
> > > > > > > >     +	valid = fscrypt_d_revalidate(dentry, flags);
> > > > > > > > +	if (valid <= 0)
> > > > > > > > +		return valid;
> > > > > > > > +
> > > > > > > This patch has confused me in the past, and today I found myself
> > > > > > > scratching my head again looking at it.
> > > > > > > 
> > > > > > > So, I've started seeing generic/123 test failing when running it with
> > > > > > > test_dummy_encryption.  I was almost sure that this test used to run fine
> > > > > > > before, but I couldn't find any evidence (somehow I lost my old testing
> > > > > > > logs...).
> > > > > > > 
> > > > > > > Anyway, the test is quite simple:
> > > > > > > 
> > > > > > > 1. Creates a directory with write permissions for root only
> > > > > > > 2. Writes into a file in that directory
> > > > > > > 3. Uses 'su' to try to modify that file as a different user, and
> > > > > > >       gets -EPERM
> > > > > > > 
> > > > > > > All these steps run fine, and the test should pass.  *However*, in the
> > > > > > > test cleanup function, a simple 'rm -rf <dir>' will fail with -ENOTEMPTY.
> > > > > > > 'strace' shows that calling unlinkat() to remove the file got a '-ENOENT'
> > > > > > > and then -ENOTEMPTY for the directory.
> > > > > > > 
> > > > > > > Some digging allowed me to figure out that running commands with 'su' will
> > > > > > > drop caches (I see 'su (874): drop_caches: 2' in the log).  And this is
> > > > > > > how I ended up looking at this patch.  fscrypt_d_revalidate() will return
> > > > > > > '0' if the parent directory does has a key (fscrypt_has_encryption_key()).
> > > > > > > Can we really say here that the dentry is *not* valid in that case?  Or
> > > > > > > should that '<= 0' be a '< 0'?
> > > > > > > 
> > > > > > > (But again, this patch has confused me before...)
> > > > > > Luis,
> > > > > > 
> > > > > > Could you reproduce it with the latest testing branch ?
> > > > > Yes, I'm seeing this with the latest code.
> > > > 
> > > > Okay. That's odd.
> > > > 
> > > > BTW, are you using the non-root user to run the test ?
> > > > 
> > > > Locally I am using the root user and still couldn't reproduce it.
> > > 
> > > Yes, I'm running the tests as root but I've also 'fsgqa' user in the
> > > system (which is used by this test.  Anyway, for reference, here's what
> > > I'm using in my fstests configuration:
> > > 
> > > TEST_FS_MOUNT_OPTS="-o name=admin,secret=<key>,copyfrom,ms_mode=crc,test_dummy_encryption"
> > > MOUNT_OPTIONS="-o name=admin,secret=<key>,copyfrom,ms_mode=crc,test_dummy_encryption"
> > > 
> > > > > 
> > > > > > I never seen the generic/123 failure yet. And just now I ran the test for many
> > > > > > times locally it worked fine.
> > > > > That's odd.  With 'test_dummy_encryption' mount option I can reproduce it
> > > > > every time.
> > > > > 
> > > > > >   From the generic/123 test code it will never touch the key while testing, that
> > > > > > means the dentries under the test dir will always have the keyed name. And then
> > > > > > the 'fscrypt_d_revalidate()' should return 1 always.
> > > > > > 
> > > > > > Only when we remove the key will it trigger evicting the inodes and then when we
> > > > > > add the key back will the 'fscrypt_d_revalidate()' return 0 by checking the
> > > > > > 'fscrypt_has_encryption_key()'.
> > > > > > 
> > > > > > As I remembered we have one or more fixes about this those days, not sure
> > > > > > whether you were hitting those bugs we have already fixed ?
> > > > > Yeah, I remember now, and I guess there's yet another one here!
> > > > > 
> > > > > I'll look closer into this and see if I can find out something else.  I'm
> > > > > definitely seeing 'fscrypt_d_revalidate()' returning 0, so probably the
> > > > > bug is in the error paths, when the 'fsgqa' user tries to write into the
> > > > > file.
> > > > 
> > > > Please add some debug logs in the code.
> > > 
> > > I *think* I've something.  The problem seems to be that, after the
> > > drop_caches, the test directory is evicted and ceph_evict_inode() will
> > > call fscrypt_put_encryption_info().  This last function will clear the
> > > inode fscrypt info.  Later on, when the test tries to write to the file
> > > with:
> > > 
> > >   _user_do "echo goo >> $my_test_subdir/data_coherency.txt"
> > > 
> > > function ceph_atomic_open() will correctly identify that '$my_test_subdir'
> > > is encrypted, but the key isn't set because the inode was evicted.  This
> > > means that fscrypt_has_encryption_key() will return '0' and DCACHE_NOKEY_NAME
> > > will be *incorrectly* added to the 'data_coherency.txt' dentry flags.
> > > 
> > > Later on, ceph_d_revalidate() will see the problem I initially described.
> > > 
> > > The (RFC) patch bellow seems to fix the issue.  Basically, it will force
> > > the fscrypt info to be set in the directory by calling __fscrypt_prepare_readdir()
> > > and the fscrypt_has_encryption_key() will then return 'true'.
> > > 
> > 
> > 
> > > Cheers
> > > --
> > > Luís
> > > 
> > > diff --git a/fs/ceph/file.c b/fs/ceph/file.c
> > > index dee3b445f415..3f2df84a6323 100644
> > > --- a/fs/ceph/file.c
> > > +++ b/fs/ceph/file.c
> > > @@ -795,7 +795,8 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
> > >  	ihold(dir);
> > >  	if (IS_ENCRYPTED(dir)) {
> > >  		set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags);
> > > -		if (!fscrypt_has_encryption_key(dir)) {
> > > +		err = __fscrypt_prepare_readdir(dir);
> > 
> > I want to say that i had something like this in place during an earlier
> > version of this series, but for different reasons. I think I convinced
> > myself later though that it wasn't needed? Oh well...
> 
> Ah, good to know it _may_ make sense :-)
> 
> > > +		if (err || (!err && !fscrypt_has_encryption_key(dir))) {
> > >  			spin_lock(&dentry->d_lock);
> > >  			dentry->d_flags |= DCACHE_NOKEY_NAME;
> > >  			spin_unlock(&dentry->d_lock);
> > 
> > Once an inode is evicted, my understanding was that it won't end up
> > being used anymore. It's on its way out of the cache and it's not hashed
> > anymore at that point.
> > 
> > How does a new atomic open after drop_caches end up with the inode
> > struct that existed before it?
> 
> Hmm... so, I *think* that what's happening is that it is a new inode but
> the key is still available.  Looking at the code it seems that fscrypt
> will get the context (->get_context()) from ceph code and then
> fscrypt_setup_encryption_info() should initialize everything in the
> inode.  And at that point fscrypt_has_encryption_key() will finally return
> 'true'.
> 
> Does this make sense?
> 

Yeah, I think so. This is also coming back to me a bit too...

Basically none of the existing fscrypt-supporting filesystems deal with
atomic_open, so we need to do *something* in this codepath to ensure
that the key is available if the parent is encrypted. The regular open
path, we call fscrypt_file_open to ensure that, but we don't have the
inode for the thing yet at this point.

__fscrypt_preapre_readdir is what we need here (though that really needs
a new name since it's not just for readdir).

Reviewed-by: Jeff Layton <jlayton@xxxxxxxxxx>




[Index of Archives]     [CEPH Users]     [Ceph Large]     [Ceph Dev]     [Information on CEPH]     [Linux BTRFS]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux