Re: [RFC PATCH v3 2/4] ceph: handle encrypted snapshot names in subdirectories

Luís Henriques <lhenriques@xxxxxxx> · Fri, 18 Mar 2022 11:49:38 +0000

Xiubo Li <xiubli@xxxxxxxxxx> writes:

> On 3/18/22 6:53 PM, Luís Henriques wrote:
>> Xiubo Li <xiubli@xxxxxxxxxx> writes:
>>
>>> On 3/17/22 11:45 PM, Luís Henriques wrote:
>>>> When creating a snapshot, the .snap directories for every subdirectory will
>>>> show the snapshot name in the "long format":
>>>>
>>>>     # mkdir .snap/my-snap
>>>>     # ls my-dir/.snap/
>>>>     _my-snap_1099511627782
>>>>
>>>> Encrypted snapshots will need to be able to handle these snapshot names by
>>>> encrypting/decrypting only the snapshot part of the string ('my-snap').
>>>>
>>>> Also, since the MDS prevents snapshot names to be bigger than 240 characters
>>>> it is necessary to adapt CEPH_NOHASH_NAME_MAX to accommodate this extra
>>>> limitation.
>>>>
>>>> Signed-off-by: Luís Henriques <lhenriques@xxxxxxx>
>>>> ---
>>>>    fs/ceph/crypto.c | 189 ++++++++++++++++++++++++++++++++++++++++-------
>>>>    fs/ceph/crypto.h |  11 ++-
>>>>    2 files changed, 169 insertions(+), 31 deletions(-)
>>>>
>>>> diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c
>>>> index beb73bbdd868..caa9863dee93 100644
>>>> --- a/fs/ceph/crypto.c
>>>> +++ b/fs/ceph/crypto.c
>>>> @@ -128,16 +128,100 @@ void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_se
>>>>    	swap(req->r_fscrypt_auth, as->fscrypt_auth);
>>>>    }
>>>>    -int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr
>>>> *d_name, char *buf)
>>>> +/*
>>>> + * User-created snapshots can't start with '_'.  Snapshots that start with this
>>>> + * character are special (hint: there aren't real snapshots) and use the
>>>> + * following format:
>>>> + *
>>>> + *   _<SNAPSHOT-NAME>_<INODE-NUMBER>
>>>> + *
>>>> + * where:
>>>> + *  - <SNAPSHOT-NAME> - the real snapshot name that may need to be decrypted,
>>>> + *  - <INODE-NUMBER> - the inode number for the actual snapshot
>>>> + *
>>>> + * This function parses these snapshot names and returns the inode
>>>> + * <INODE-NUMBER>.  'name_len' will also bet set with the <SNAPSHOT-NAME>
>>>> + * length.
>>>> + */
>>>> +static struct inode *parse_longname(const struct inode *parent, const char *name,
>>>> +				    int *name_len)
>>>>    {
>>>> +	struct inode *dir = NULL;
>>>> +	struct ceph_vino vino = { .snap = CEPH_NOSNAP };
>>>> +	char *inode_number;
>>>> +	char *name_end;
>>>> +	int orig_len = *name_len;
>>>> +	int ret = -EIO;
>>>> +
>>>> +	/* Skip initial '_' */
>>>> +	name++;
>>>> +	name_end = strrchr(name, '_');
>>>> +	if (!name_end) {
>>>> +		dout("Failed to parse long snapshot name: %s\n", name);
>>>> +		return ERR_PTR(-EIO);
>>>> +	}
>>>> +	*name_len = (name_end - name);
>>>> +	if (*name_len <= 0) {
>>>> +		pr_err("Failed to parse long snapshot name\n");
>>>> +		return ERR_PTR(-EIO);
>>>> +	}
>>>> +
>>>> +	/* Get the inode number */
>>>> +	inode_number = kmemdup_nul(name_end + 1,
>>>> +				   orig_len - *name_len - 2,
>>>> +				   GFP_KERNEL);
>>>> +	if (!inode_number)
>>>> +		return ERR_PTR(-ENOMEM);
>>>> +	ret = kstrtou64(inode_number, 0, &vino.ino);
>>>> +	if (ret) {
>>>> +		dout("Failed to parse inode number: %s\n", name);
>>>> +		dir = ERR_PTR(ret);
>>>> +		goto out;
>>>> +	}
>>>> +
>>>> +	/* And finally the inode */
>>>> +	dir = ceph_find_inode(parent->i_sb, vino);
>>>> +	if (!dir) {
>>>> +		/* This can happen if we're not mounting cephfs on the root */
>>>> +		dir = ceph_get_inode(parent->i_sb, vino, NULL);
>>> In this case IMO you should lookup the inode from MDS instead create it in the
>>> cache, which won't setup the encryption info needed.
>>>
>>> So later when you try to use this to dencrypt the snapshot names, you will hit
>>> errors ? And also the case Jeff mentioned in previous thread could happen.
>> No, I don't see any errors.  The reason is that if we get a I_NEW inode,
>> we do not have the keys to even decrypt the names.  If you mount a
>> filesystem using as root a directory that is inside an encrypted
>> directory, you'll see the encrypted snapshot name:
>>
>>   # mkdir mydir
>>   # fscrypt encrypt mydir
>>   # mkdir -p mydir/a/b/c/d
>>   # mkdir mydir/a/.snap/myspan
>>   # umount ...
>>   # mount <mon>:<port>:/a
>>   # ls .snap
>>
>> And we simply can't decrypt it because for that we'd need to have access
>> to the .fscrypt in the original filesystem mount root.
>
> Should we resolve this issue ? Something like try to copy the .fscrypt when
> mounting '/a' ?

I don't think this is an issue.  If an admin mounts a filesystem this way,
he must know what he's doing.  Being unable to decrypt a directory because
he picked the wrong root is a user error.  (Having documentation will
help, of course.)

Also, where would we copy the .fscrypt from?  You can run 'fscrypt setup'
as many times as you want in a single cephfs, you simply need to use
different root directories.  So, yeah, my opinion is that we simply need
to hand this gracefully in the client.

Cheers,
-- 
Luís