On Wed, 2020-03-04 at 14:24 -0500, Jeff Layton wrote:
> On Wed, 2020-03-04 at 12:48 -0500, Jeff Layton wrote:
> > On Thu, 2020-03-05 at 01:32 +0800, Yan, Zheng wrote:
> > > This series makes the cephfs client stop requesting caps for open
> > > files that have been idle for a long time. For the case where one
> > > active client and multiple standby clients open the same file, this
> > > increases the likelihood that the mds issues exclusive caps to the
> > > active client.
> > > 
> > > Yan, Zheng (6):
> > >   ceph: always renew caps if mds_wanted is insufficient
> > >   ceph: consider inode's last read/write when calculating wanted caps
> > >   ceph: remove delay check logic from ceph_check_caps()
> > >   ceph: simplify calling of ceph_get_fmode()
> > >   ceph: update i_requested_max_size only when sending cap msg to auth mds
> > >   ceph: check all mds' caps after page writeback
> > > 
> > >  fs/ceph/caps.c               | 338 ++++++++++++++++-------------------
> > >  fs/ceph/file.c               |  45 ++---
> > >  fs/ceph/inode.c              |  21 +--
> > >  fs/ceph/ioctl.c              |   2 +
> > >  fs/ceph/mds_client.c         |   5 -
> > >  fs/ceph/super.h              |  37 ++--
> > >  include/linux/ceph/ceph_fs.h |   1 +
> > >  7 files changed, 202 insertions(+), 247 deletions(-)
> > > 
> > > changes since v2
> > > - make __ceph_caps_file_wanted() more readable
> > > - add patches 5 and 6, which fix a hung write seen while testing patches 1-4
> > > 
> > > changes since v3
> > > - don't queue delayed cap check for snap inodes
> > > - initialize ci->{last_rd,last_wr} to jiffies - 3600 * HZ
> > > - make __ceph_caps_file_wanted() check the inode type
> > > 
> > 
> > I just tried this out, and it still seems to kill unlink performance
> > with -o nowsync.
> > From the script I posted to the other thread:
> > 
> > --------8<--------
> > $ ./test-async-dirops.sh
> >   File: /mnt/cephfs/test-dirops.1401
> >   Size: 0          Blocks: 0          IO Block: 65536  directory
> > Device: 26h/38d    Inode: 1099511627778  Links: 2
> > Access: (0775/drwxrwxr-x)  Uid: ( 4447/ jlayton)  Gid: ( 4447/ jlayton)
> > Context: system_u:object_r:cephfs_t:s0
> > Access: 2020-03-04 12:42:03.914006772 -0500
> > Modify: 2020-03-04 12:42:03.914006772 -0500
> > Change: 2020-03-04 12:42:03.914006772 -0500
> >  Birth: 2020-03-04 12:42:03.914006772 -0500
> > Creating files in /mnt/cephfs/test-dirops.1401
> > 
> > real	0m6.269s
> > user	0m0.123s
> > sys	0m0.454s
> > 
> > sync
> > 
> > real	0m5.358s
> > user	0m0.003s
> > sys	0m0.011s
> > Starting rm
> > 
> > real	0m18.932s
> > user	0m0.169s
> > sys	0m0.713s
> > 
> > rmdir
> > 
> > real	0m0.135s
> > user	0m0.000s
> > sys	0m0.002s
> > 
> > sync
> > 
> > real	0m1.725s
> > user	0m0.000s
> > sys	0m0.002s
> > --------8<--------
> > 
> > The create and sync parts look reasonable. Usually the rm part finishes
> > in less than a second, as we end up doing most of the removes
> > asynchronously, but here it didn't. I suspect that means that unlink
> > didn't get caps for some reason, but I'd need to do some debugging to
> > figure out what's actually happening.
> > 
> 
> A bit more investigation. I piled on some tracepoints that I had plumbed
> in a while back.
> 
> It starts with making a directory (NOSNAP:0x1000000c6c0) and then
> doing the first sync create:
> 
> kworker/7:1-2021 [007] .... 5316.161752: ceph_add_cap: ino=NOSNAP:0x1 mds=0 issued=pAsLsXs implemented=pAsLsXs mds_wanted=p
> kworker/7:1-2021 [007] .... 5316.163241: ceph_add_cap: ino=NOSNAP:0x1 mds=0 issued=pAsLsXs implemented=pAsLsXs mds_wanted=p
> kworker/7:1-2021 [007] .... 5316.163250: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsxLsXsxFsx implemented=pAsxLsXsxFsx mds_wanted=-
> kworker/7:1-2021 [007] .... 5316.168011: ceph_handle_cap_grant: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsxFsx mds_wanted=p
> kworker/7:1-2021 [007] .... 5316.168830: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=pAsLsXsFsxcral
> kworker/7:1-2021 [007] .... 5316.168839: ceph_add_cap: ino=NOSNAP:0x1000000c6c1 mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxXsxFxwb
> test-async-diro-2257 [005] .... 5316.169004: ceph_sync_create: name=NOSNAP:0x1000000c6c0/1(0x1000000c6c1)
> 
> We get a cap grant on the directory in the sync create reply, and that
> sets mds_wanted:
> 
> kworker/7:1-2021 [007] .... 5316.169075: ceph_handle_cap_grant: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsxc implemented=pAsLsXsFsxc mds_wanted=pAsLsXsFsxcral
> kworker/7:1-2021 [007] .... 5316.170209: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsxc implemented=pAsLsXsFsxc mds_wanted=pAsLsXsFsxcral
> 
> ...eventually we start doing async creates and continue that to the end
> of the creates. (Side note: the second create was also synchronous --
> not sure why):
> 
> kworker/7:1-2021 [007] .... 5316.170216: ceph_add_cap: ino=NOSNAP:0x1000000c6c2 mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxXsxFxwb
> test-async-diro-2257 [005] .... 5316.170251: ceph_sync_create: name=NOSNAP:0x1000000c6c0/2(0x1000000c6c2)
> test-async-diro-2257 [005] .... 5316.170412: ceph_add_cap: ino=NOSNAP:0x1000000c554 mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxLsXsxFsxcrwb
> test-async-diro-2257 [005] .... 5316.170415: ceph_async_create: name=NOSNAP:0x1000000c6c0/3(0x1000000c554)
> test-async-diro-2257 [005] .... 5316.170571: ceph_add_cap: ino=NOSNAP:0x1000000c555 mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxLsXsxFsxcrwb
> 
> Eventually we finish creating and, soon after, get a cap revoke that
> resets mds_wanted back to only "p":
> 
> kworker/7:1-2021 [007] .... 5316.461890: ceph_handle_cap_grant: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFs implemented=pAsLsXsFsc mds_wanted=p
> 
> ...and it never changes, and we're never granted Du caps (aka Fr):
> 
> rm-2262 [006] .... 5316.466221: ceph_sync_unlink: name=NOSNAP:0x1000000c6c0/1(0x1000000c6c1)
> kworker/7:1-2021 [007] .... 5316.467108: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
> kworker/7:1-2021 [007] .... 5316.467676: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
> kworker/7:1-2021 [007] .... 5316.467679: ceph_add_cap: ino=NOSNAP:0x1000000c55b mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxXsxFxwb
> rm-2262 [006] .... 5316.467714: ceph_sync_unlink: name=NOSNAP:0x1000000c6c0/10(0x1000000c55b)
> kworker/7:1-2021 [007] .... 5316.468424: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
> kworker/7:1-2021 [007] .... 5316.468956: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
> kworker/7:1-2021 [007] .... 5316.468960: ceph_add_cap: ino=NOSNAP:0x1000000c5b5 mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxXsxFxwb
> rm-2262 [006] .... 5316.468994: ceph_sync_unlink: name=NOSNAP:0x1000000c6c0/100(0x1000000c5b5)
> kworker/7:1-2021 [007] .... 5316.469728: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
> kworker/7:1-2021 [007] .... 5316.470237: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
> kworker/7:1-2021 [007] .... 5316.470242: ceph_add_cap: ino=NOSNAP:0x1000000c55c mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxXsxFxwb
> 
> Given that we've overloaded Fr caps for Du, I wonder if the new timeout
> logic is creeping in here. In particular, we don't want those caps to
> ever time out.
> 

I did a bisect, and the first two patches in the series seem to be fine.
This patch causes the problem:

    [PATCH v4 3/6] ceph: remove delay check logic from ceph_check_caps()

...though I've not yet spotted the problem.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>
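P.S. For anyone else staring at these traces: what matters above is where
mds_wanted changes for the directory inode. A throwaway script like the
following (not part of the series; the regex just matches the ad-hoc shape
of the tracepoint lines quoted above) pulls out those transitions:

```python
import re

# Matches the tracepoint lines above, e.g.
# "kworker/7:1-2021 [007] .... 5316.461890: ceph_handle_cap_grant:
#  ino=NOSNAP:0x1000000c6c0 mds=0 issued=... mds_wanted=p"
TRACE_RE = re.compile(
    r'(?P<ts>\d+\.\d+): (?P<event>\w+): ino=(?P<ino>\S+) .*mds_wanted=(?P<wanted>\S+)'
)

def wanted_transitions(lines, ino):
    """Yield (timestamp, event, mds_wanted) each time mds_wanted changes for ino."""
    last = None
    for line in lines:
        m = TRACE_RE.search(line)
        if not m or m.group('ino') != ino:
            continue
        wanted = m.group('wanted')
        if wanted != last:
            last = wanted
            yield m.group('ts'), m.group('event'), wanted

# A few of the trace lines from above, for illustration:
trace = [
    "kworker/7:1-2021 [007] .... 5316.163250: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsxLsXsxFsx implemented=pAsxLsXsxFsx mds_wanted=-",
    "kworker/7:1-2021 [007] .... 5316.169075: ceph_handle_cap_grant: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsxc implemented=pAsLsXsFsxc mds_wanted=pAsLsXsFsxcral",
    "kworker/7:1-2021 [007] .... 5316.170209: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsxc implemented=pAsLsXsFsxc mds_wanted=pAsLsXsFsxcral",
    "kworker/7:1-2021 [007] .... 5316.461890: ceph_handle_cap_grant: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFs implemented=pAsLsXsFsc mds_wanted=p",
]

for ts, event, wanted in wanted_transitions(trace, "NOSNAP:0x1000000c6c0"):
    print(ts, event, wanted)
```

Fed the full trace, it shows mds_wanted going "-" -> "pAsLsXsFsxcral" ->
"p" on the directory and then never recovering, which is the revoke
described above.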