Re: [PATCH 4/5] ceph: flush the mdlog before waiting on unsafe reqs

Hi Xiubo,

On Wed, Jun 30, 2021 at 6:16 PM Xiubo Li <xiubli@xxxxxxxxxx> wrote:
> >> Normally the mdlog submit thread is triggered on each MDS tick,
> >> i.e. every 5 seconds. But this is not always the case, mostly because
> >> any other client request could trigger the mdlog submit thread to run
> >> at any time. Since fsync is not running all the time, IMO the
> >> performance impact should be okay.
> >>
> >>
> > I'm not sure I'm convinced.
> >
> > Consider a situation where we have a large(ish) ceph cluster with
> > several MDSs. One client is writing to a file that is on mds.0 and there
> > is little other activity there. Several other clients are doing heavy
> > I/O on other inodes (of which mds.1 is auth).
> >
> > The first client then calls fsync, and now the other clients stall for a
> > bit while mds.1 unnecessarily flushes its mdlog. I think we need to take
> > care to only flush the mdlog for mds's that we care about here.
>
> Okay. Apart from the case I mentioned above, I didn't find anything that
> would prevent us from doing this.
>
> Let me test this further by flushing the mdlog only on the auth MDS.

I think Jeff raises a good point. I looked at the history of the
flush_mdlog session command. It was used to implement syncfs, which
necessarily requires all MDSs to flush their journals (at least those
MDSs communicating with the client).
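
To make the "only flush the MDSs we care about" idea concrete, here is a
toy userspace sketch. It is not kernel code: struct unsafe_req,
pick_mds_to_flush() and MAX_MDS are hypothetical names invented for
illustration. It assumes each unsafe request on the inode records the rank
of the MDS that handled it, and it only computes the set of ranks that
would need a flush_mdlog.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_MDS 16  /* hypothetical upper bound on MDS ranks for this toy */

/* Hypothetical stand-in for an unsafe (not yet journaled) request. */
struct unsafe_req {
    int mds;  /* rank of the MDS that sent the unsafe reply */
};

/*
 * Mark in flush[] every MDS rank that holds at least one unsafe request
 * for the inode being fsync'ed, and return the number of distinct ranks.
 * MDSs with no unsafe requests are left alone, so they are not asked to
 * flush their mdlog.
 */
static size_t pick_mds_to_flush(const struct unsafe_req *reqs, size_t nreqs,
                                bool flush[MAX_MDS])
{
    size_t count = 0;

    assert(reqs != NULL || nreqs == 0);

    for (size_t i = 0; i < MAX_MDS; i++)
        flush[i] = false;

    for (size_t i = 0; i < nreqs; i++) {
        int mds = reqs[i].mds;

        if (mds < 0 || mds >= MAX_MDS)
            continue;  /* ignore ranks outside the toy's range */
        if (!flush[mds]) {
            flush[mds] = true;
            count++;
        }
    }
    return count;
}
```

In Jeff's example, a client with unsafe requests only on mds.0 would end
up with a one-element set, and mds.1 would never be asked to flush.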

During my testing of the original bug I found that running "stat ."
prior to fsync caused the hang to go away. This is because the stat
forced the MDS to flush its log in order to issue new caps to the
client. I think we need to understand that behavior better: the MDS is
revoking caps on the client to execute the rename RPC. It is probably
sufficient to change fsync to do a getattr for the appropriate
(read/shared) caps instead of flushing the MDS journal.

Your time spent adding flush_mdlog is not wasted; I think a good follow-up
patch would be to add syncfs support the same way ceph-fuse does.

-- 
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat Sunnyvale, CA
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D



