On Wed, 2021-11-03 at 09:22 +0800, xiubli@xxxxxxxxxx wrote:
> From: Jeff Layton <jlayton@xxxxxxxxxx>
>
> This patch series is based on the "wip-fscrypt-fnames" branch in the
> https://github.com/ceph/ceph-client.git repo.
>
> I have also picked up 5 patches from the "ceph-fscrypt-size-experimental"
> branch in the
> https://git.kernel.org/pub/scm/linux/kernel/git/jlayton/linux.git repo.
>
> ====
>
> This approach is based on the discussion from V1 and V2: the client
> passes the encrypted contents of the last block to the MDS along with
> the truncate request.
>
> The encrypted last-block contents are sent with the truncate request
> only when truncating to a smaller size, and that new size is not
> aligned to the fscrypt BLOCK SIZE.
>
> The MDS side of this change is raised in PR
> https://github.com/ceph/ceph/pull/43588, which is also based on Jeff's
> previous great work in PR https://github.com/ceph/ceph/pull/41284.
>
> The MDS will use filer.write_trunc(), which can update and truncate
> the file in one shot, instead of filer.truncate().
>
> This assumes the kclient won't support the inline data feature, which
> will be removed soon. For more detail, please see:
> https://tracker.ceph.com/issues/52916
>
> Changes in V5:
> - Rebase to the "wip-fscrypt-fnames" branch in the ceph-client.git repo.
> - Pick up 5 patches from Jeff's "ceph-fscrypt-size-experimental" branch
>   in the linux.git repo.
> - Add an "i_truncate_pagecache_size" member to struct ceph_inode_info.
>   This is used to truncate the pagecache on the kclient side only,
>   because "i_truncate_size" will always be aligned to the BLOCK SIZE.
>   In the fscrypt case we need the real size to truncate the pagecache.
>
> Changes in V4:
> - Retry the truncate request up to 20 times before failing it with
>   -EAGAIN.
> - Remove the "fill_last_block" label and move the code to the else
>   branch.
> - Drop the #3 patch from the V3 series; it has already been sent out
>   separately.
> - Improve some comments in the code.
> Changes in V3:
> - Fix possible corruption of the file when another client updates it
>   just before the MDS acquires the xlock for the FILE lock.
> - Flush the pagecache buffer before reading the last block when
>   filling the truncate request.
> - Some other minor fixes.
>
> Jeff Layton (5):
>   libceph: add CEPH_OSD_OP_ASSERT_VER support
>   ceph: size handling for encrypted inodes in cap updates
>   ceph: fscrypt_file field handling in MClientRequest messages
>   ceph: get file size from fscrypt_file when present in inode traces
>   ceph: handle fscrypt fields in cap messages from MDS
>
> Xiubo Li (3):
>   ceph: add __ceph_get_caps helper support
>   ceph: add __ceph_sync_read helper support
>   ceph: add truncate size handling support for fscrypt
>
>  fs/ceph/caps.c                  | 136 ++++++++++++++----
>  fs/ceph/crypto.h                |   4 +
>  fs/ceph/dir.c                   |   3 +
>  fs/ceph/file.c                  |  43 ++++--
>  fs/ceph/inode.c                 | 236 +++++++++++++++++++++++++++++---
>  fs/ceph/mds_client.c            |   9 +-
>  fs/ceph/mds_client.h            |   2 +
>  fs/ceph/super.h                 |  10 ++
>  include/linux/ceph/crypto.h     |  28 ++++
>  include/linux/ceph/osd_client.h |   6 +-
>  include/linux/ceph/rados.h      |   4 +
>  net/ceph/osd_client.c           |   5 +
>  12 files changed, 427 insertions(+), 59 deletions(-)
>  create mode 100644 include/linux/ceph/crypto.h

Thanks Xiubo,

This looks like a great start. I set up an environment against a cephadm
cluster with your fscrypt changes, and started running xfstests against
it with test_dummy_encryption enabled.
It got to generic/014 and the test hung waiting on a SETATTR call to
come back:

[root@client1 f3cf8b7a-38ec-11ec-a0e4-52540031ba78.client74208]# cat mdsc
89447   mds0    setattr  #1000003b19c

Looking at the MDS that it was talking to, I see:

Nov 03 08:25:09 cephadm2 ceph-mds[3133]: log_channel(cluster) log [WRN] : 1 slow requests, 1 included below; oldest blocked for > 31.627241 secs
Nov 03 08:25:09 cephadm2 ceph-mds[3133]: log_channel(cluster) log [WRN] : slow request 31.627240 seconds old, received at 2021-11-03T12:24:37.911553+0000: client_request(client.74208:89447 setattr size=102498304 #0x1000003b19c 2021-11-03T12:24:37.895292+0000 caller_uid=0, caller_gid=0{0,}) currently acquired locks
Nov 03 08:25:14 cephadm2 ceph-mds[3133]: log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 36.627323 secs
Nov 03 08:25:19 cephadm2 ceph-mds[3133]: log_channel(cluster) log [WRN] : 1 slow requests, 0 included below; oldest blocked for > 41.627389 secs

...and it still hasn't resolved. I'll keep looking around a bit more, but
I think there are still some bugs in here. Let me know if you have
thoughts as to what the issue is.

Thanks,
-- 
Jeff Layton <jlayton@xxxxxxxxxx>