Re: [PATCH v4 0/6] ceph: don't request caps for idle open files

On Wed, 2020-03-04 at 14:24 -0500, Jeff Layton wrote:
> On Wed, 2020-03-04 at 12:48 -0500, Jeff Layton wrote:
> > On Thu, 2020-03-05 at 01:32 +0800, Yan, Zheng wrote:
> > > This series makes the cephfs client stop requesting caps for open
> > > files that have been idle for a long time. In the case where one
> > > active client and multiple standby clients have the same file open,
> > > this increases the likelihood that the MDS issues exclusive caps to
> > > the active client.
> > > 
> > > Yan, Zheng (6):
> > >   ceph: always renew caps if mds_wanted is insufficient
> > >   ceph: consider inode's last read/write when calculating wanted caps
> > >   ceph: remove delay check logic from ceph_check_caps()
> > >   ceph: simplify calling of ceph_get_fmode()
> > >   ceph: update i_requested_max_size only when sending cap msg to auth mds
> > >   ceph: check all mds' caps after page writeback
> > > 
> > >  fs/ceph/caps.c               | 338 ++++++++++++++++-------------------
> > >  fs/ceph/file.c               |  45 ++---
> > >  fs/ceph/inode.c              |  21 +--
> > >  fs/ceph/ioctl.c              |   2 +
> > >  fs/ceph/mds_client.c         |   5 -
> > >  fs/ceph/super.h              |  37 ++--
> > >  include/linux/ceph/ceph_fs.h |   1 +
> > >  7 files changed, 202 insertions(+), 247 deletions(-)
> > > 
> > > changes since v2
> > >  - make __ceph_caps_file_wanted() more readable
> > >  - add patch 5 and 6, which fix hung write during testing patch 1~4
> > > 
> > > changes since v3
> > >  - don't queue delayed cap check for snap inode
> > >  - initialize ci->{last_rd,last_wr} to jiffies - 3600 * HZ
> > >  - make __ceph_caps_file_wanted() check inode type
> > > 
> > 
> > I just tried this out, and it still seems to kill unlink performance
> > with -o nowsync. From the script I posted to the other thread:
> > 
> > --------8<--------
> > $ ./test-async-dirops.sh 
> >   File: /mnt/cephfs/test-dirops.1401
> >   Size: 0         	Blocks: 0          IO Block: 65536  directory
> > Device: 26h/38d	Inode: 1099511627778  Links: 2
> > Access: (0775/drwxrwxr-x)  Uid: ( 4447/ jlayton)   Gid: ( 4447/ jlayton)
> > Context: system_u:object_r:cephfs_t:s0
> > Access: 2020-03-04 12:42:03.914006772 -0500
> > Modify: 2020-03-04 12:42:03.914006772 -0500
> > Change: 2020-03-04 12:42:03.914006772 -0500
> >  Birth: 2020-03-04 12:42:03.914006772 -0500
> > Creating files in /mnt/cephfs/test-dirops.1401
> > 
> > real	0m6.269s
> > user	0m0.123s
> > sys	0m0.454s
> > 
> > sync
> > 
> > real	0m5.358s
> > user	0m0.003s
> > sys	0m0.011s
> > Starting rm
> > 
> > real	0m18.932s
> > user	0m0.169s
> > sys	0m0.713s
> > 
> > rmdir
> > 
> > real	0m0.135s
> > user	0m0.000s
> > sys	0m0.002s
> > 
> > sync
> > 
> > real	0m1.725s
> > user	0m0.000s
> > sys	0m0.002s
> > --------8<--------
> > 
> > The create and sync parts look reasonable. Usually the rm part
> > finishes in less than a second, since we end up doing most of the
> > removes asynchronously, but here it didn't. I suspect that means
> > unlink didn't get caps for some reason, but I'd need to do some
> > debugging to figure out what's actually happening.
> > 
> 
> A bit more investigation. I piled on some tracepoints that I had plumbed
> in a while back:
> 
> It starts with making a directory (NOSNAP:0x1000000c6c0) and then
> doing the first sync create:
> 
>      kworker/7:1-2021  [007] ....  5316.161752: ceph_add_cap: ino=NOSNAP:0x1 mds=0 issued=pAsLsXs implemented=pAsLsXs mds_wanted=p
>      kworker/7:1-2021  [007] ....  5316.163241: ceph_add_cap: ino=NOSNAP:0x1 mds=0 issued=pAsLsXs implemented=pAsLsXs mds_wanted=p
>      kworker/7:1-2021  [007] ....  5316.163250: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsxLsXsxFsx implemented=pAsxLsXsxFsx mds_wanted=-
>      kworker/7:1-2021  [007] ....  5316.168011: ceph_handle_cap_grant: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsxFsx mds_wanted=p
>      kworker/7:1-2021  [007] ....  5316.168830: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=pAsLsXsFsxcral
>      kworker/7:1-2021  [007] ....  5316.168839: ceph_add_cap: ino=NOSNAP:0x1000000c6c1 mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxXsxFxwb
>  test-async-diro-2257  [005] ....  5316.169004: ceph_sync_create: name=NOSNAP:0x1000000c6c0/1(0x1000000c6c1)
> 
> We get a cap grant on the directory in the sync create reply and that sets mds_wanted:
> 
>      kworker/7:1-2021  [007] ....  5316.169075: ceph_handle_cap_grant: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsxc implemented=pAsLsXsFsxc mds_wanted=pAsLsXsFsxcral
>      kworker/7:1-2021  [007] ....  5316.170209: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsxc implemented=pAsLsXsFsxc mds_wanted=pAsLsXsFsxcral
> 
> ...eventually we start doing async creates and continue that to the end
> of the creates. (Side note: the second create was also synchronous --
> not sure why):
> 
>      kworker/7:1-2021  [007] ....  5316.170216: ceph_add_cap: ino=NOSNAP:0x1000000c6c2 mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxXsxFxwb
>  test-async-diro-2257  [005] ....  5316.170251: ceph_sync_create: name=NOSNAP:0x1000000c6c0/2(0x1000000c6c2)
>  test-async-diro-2257  [005] ....  5316.170412: ceph_add_cap: ino=NOSNAP:0x1000000c554 mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxLsXsxFsxcrwb
>  test-async-diro-2257  [005] ....  5316.170415: ceph_async_create: name=NOSNAP:0x1000000c6c0/3(0x1000000c554)
>  test-async-diro-2257  [005] ....  5316.170571: ceph_add_cap: ino=NOSNAP:0x1000000c555 mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxLsXsxFsxcrwb
> 
> Eventually we finish creating, and soon after we get a cap revoke
> that resets mds_wanted back to only "p":
> 
>      kworker/7:1-2021  [007] ....  5316.461890: ceph_handle_cap_grant: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFs implemented=pAsLsXsFsc mds_wanted=p
> 
> ...and it never changes and we're never granted Du caps (aka Fr):
> 
>               rm-2262  [006] ....  5316.466221: ceph_sync_unlink: name=NOSNAP:0x1000000c6c0/1(0x1000000c6c1)
>      kworker/7:1-2021  [007] ....  5316.467108: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
>      kworker/7:1-2021  [007] ....  5316.467676: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
>      kworker/7:1-2021  [007] ....  5316.467679: ceph_add_cap: ino=NOSNAP:0x1000000c55b mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxXsxFxwb
>               rm-2262  [006] ....  5316.467714: ceph_sync_unlink: name=NOSNAP:0x1000000c6c0/10(0x1000000c55b)
>      kworker/7:1-2021  [007] ....  5316.468424: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
>      kworker/7:1-2021  [007] ....  5316.468956: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
>      kworker/7:1-2021  [007] ....  5316.468960: ceph_add_cap: ino=NOSNAP:0x1000000c5b5 mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxXsxFxwb
>               rm-2262  [006] ....  5316.468994: ceph_sync_unlink: name=NOSNAP:0x1000000c6c0/100(0x1000000c5b5)
>      kworker/7:1-2021  [007] ....  5316.469728: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
>      kworker/7:1-2021  [007] ....  5316.470237: ceph_add_cap: ino=NOSNAP:0x1000000c6c0 mds=0 issued=pAsLsXsFsx implemented=pAsLsXsFsx mds_wanted=p
>      kworker/7:1-2021  [007] ....  5316.470242: ceph_add_cap: ino=NOSNAP:0x1000000c55c mds=0 issued=pAsxLsXsxFsxcrwb implemented=pAsxLsXsxFsxcrwb mds_wanted=pAsxXsxFxwb
> 
> Given that we've overloaded Fr caps for Du, I wonder if the new timeout
> logic is creeping in here. In particular, we don't want those caps to
> ever time out.
> 

I did a bisect and the first two patches in the series seem to be fine.
This patch causes the problem:

     [PATCH v4 3/6] ceph: remove delay check logic from ceph_check_caps()

...though I've not yet spotted the problem.
-- 
Jeff Layton <jlayton@xxxxxxxxxx>
