Re: [PATCH] ceph: request expedited service when flushing caps

On Tue, Mar 31, 2020 at 6:49 AM Yan, Zheng <ukernel@xxxxxxxxx> wrote:
>
> On Tue, Mar 31, 2020 at 6:52 PM Jeff Layton <jlayton@xxxxxxxxxx> wrote:
> >
> > Jan noticed some long stalls when flushing caps using sync() after
> > doing small file creates. For instance, running this:
> >
> >     $ time for i in $(seq -w 11 30); do echo "Hello World" > hello-$i.txt; sync -f ./hello-$i.txt; done
> >
> > Could take more than 90s in some cases. The sync() will flush out caps,
> > but doesn't tell the MDS that it's waiting synchronously on the
> > replies.
> >
> > When ceph_check_caps finds that CHECK_CAPS_FLUSH is set, set the
> > CEPH_CLIENT_CAPS_SYNC bit in the cap update request. This tells the MDS
> > that the client is waiting on the reply, so it can expedite it.
> >
> > URL: https://tracker.ceph.com/issues/44744
> > Reported-and-Tested-by: Jan Fajerski <jfajerski@xxxxxxxx>
> > Signed-off-by: Jeff Layton <jlayton@xxxxxxxxxx>
> > ---
> >  fs/ceph/caps.c | 7 +++++--
> >  1 file changed, 5 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
> > index 61808793e0c0..6403178f2376 100644
> > --- a/fs/ceph/caps.c
> > +++ b/fs/ceph/caps.c
> > @@ -2111,8 +2111,11 @@ void ceph_check_caps(struct ceph_inode_info *ci, int flags,
> >
> >                 mds = cap->mds;  /* remember mds, so we don't repeat */
> >
> > -               __prep_cap(&arg, cap, CEPH_CAP_OP_UPDATE, 0, cap_used, want,
> > -                          retain, flushing, flush_tid, oldest_flush_tid);
> > +               __prep_cap(&arg, cap, CEPH_CAP_OP_UPDATE,
> > +                          (flags & CHECK_CAPS_FLUSH) ?
> > +                           CEPH_CLIENT_CAPS_SYNC : 0,
> > +                          cap_used, want, retain, flushing, flush_tid,
> > +                          oldest_flush_tid);
> >                 spin_unlock(&ci->i_ceph_lock);
> >
>
> This is too expensive for the syncfs case: the MDS needs to flush its
> journal for each dirty inode. We'd be better off tracking dirty inodes per
> session and setting the flag only when flushing the last inode in the
> session's dirty list.

Yeah, see the userspace Client::_sync_fs() where we have an internal
flags argument which is set on the last cap in the dirty set and tells
the actual cap message flushing code to set FLAG_SYNC on the
MClientCaps message. I presume the kernel is operating on a similar
principle here?
-Greg
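
For illustration only, here is a minimal userspace sketch of the "flag only
the last flush" idea discussed above. It is not the kernel or libcephfs
code: SKETCH_CAPS_SYNC and sketch_flush_flags() are hypothetical names, the
constant is only a stand-in for CEPH_CLIENT_CAPS_SYNC, and the per-session
dirty list is assumed to be reducible to a simple count of pending flushes.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for CEPH_CLIENT_CAPS_SYNC; the value here is illustrative. */
#define SKETCH_CAPS_SYNC 0x1u

/*
 * Hypothetical helper: compute the flags for each cap flush sent for a
 * session that has n_dirty dirty inodes. Only the final flush carries
 * the sync flag, so the MDS would be asked to expedite (and flush its
 * journal) once per session instead of once per inode.
 */
static void sketch_flush_flags(size_t n_dirty, unsigned int *flags_out)
{
    assert(n_dirty == 0 || flags_out != NULL);

    for (size_t i = 0; i < n_dirty; i++)
        flags_out[i] = (i + 1 == n_dirty) ? SKETCH_CAPS_SYNC : 0;
}
```

With three dirty inodes, the first two flushes would carry no flag and only
the third would carry the sync flag. A single dirty inode (the sync -f case
from the patch description) would still get the flag on its one flush.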


>
>
> >                 __send_cap(mdsc, &arg, ci);
> > --
> > 2.25.1
> >
>



