Re: CephFS constant high write I/O to the metadata pool

Olli Rajala <olli.rajala@xxxxxxxx> · Thu, 10 Nov 2022 23:35:55 +0200

Hi Venky,

I have indeed observed the output of the different sections of perf dump
like so:
watch -n 1 ceph tell mds.`hostname` perf dump objecter
watch -n 1 ceph tell mds.`hostname` perf dump mds_cache
...etc...

...but without a proper understanding of what a normal rate of increase is
for any given counter, it's really difficult to make anything of that.
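
One way to put numbers on "going up" would be to diff two snapshots and turn
the deltas into per-second rates. Below is a minimal Python sketch, not a Ceph
tool. It assumes each snapshot is the parsed JSON from
`ceph tell mds.<name> perf dump` and that counters are plain numbers nested
under section names. The helper names (flatten_counters, counter_rates) are
made up for illustration. It only shows which counters move fastest; it does
not say what rate is normal.

```python
from typing import Any, Dict, List, Tuple


def flatten_counters(dump: Dict[str, Any], prefix: str = "") -> Dict[str, float]:
    """Flatten a nested perf dump into {"section.counter": value}.

    Nested dicts (sections, or avgcount/sum style sub-counters) are
    walked recursively; non-numeric values are skipped.
    """
    flat: Dict[str, float] = {}
    for key, value in dump.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_counters(value, name))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            flat[name] = float(value)
    return flat


def counter_rates(
    before: Dict[str, Any], after: Dict[str, Any], seconds: float
) -> List[Tuple[str, float]]:
    """Return (counter, change per second) for every counter that changed,
    sorted so the fastest-moving counters come first."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    old = flatten_counters(before)
    new = flatten_counters(after)
    rates = [
        (name, (new[name] - old[name]) / seconds)
        for name in new
        if name in old and new[name] != old[name]
    ]
    rates.sort(key=lambda item: abs(item[1]), reverse=True)
    return rates
```

Called as counter_rates(before, after, 10.0) on two snapshots taken 10 seconds
apart, it returns (name, rate) pairs with the busiest counters first. Note that
gauge-type values (for example an inode count) show up as net change per
second rather than as throughput.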

Btw, is there some convenient way to capture this kind of time-series output
for others to view? Sure, I could just dump once a second to a file or to
sequential files, but is there some tool or convention that makes the result
easy to look at and analyze?
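
The "dump once a second to a file" variant could look roughly like the Python
sketch below. It assumes that `ceph tell mds.<name> perf dump` prints JSON on
stdout. Writing one timestamped JSON object per line (JSON Lines) is just a
convenient choice here, not a Ceph convention, and the function names
(fetch_perf_dump, capture) are hypothetical.

```python
import json
import subprocess
import time
from typing import Any, Callable, Dict, IO


def fetch_perf_dump(mds_name: str) -> Dict[str, Any]:
    """Run `ceph tell mds.<mds_name> perf dump` and parse its JSON output."""
    result = subprocess.run(
        ["ceph", "tell", f"mds.{mds_name}", "perf", "dump"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def capture(
    fetch: Callable[[], Dict[str, Any]],
    sink: IO[str],
    samples: int,
    interval: float = 1.0,
    clock: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Write `samples` timestamped snapshots to `sink`, one JSON object per line.

    `fetch` returns one snapshot; `clock` and `sleep` are injectable so the
    loop can be exercised without a cluster. Returns the number of lines written.
    """
    written = 0
    for i in range(samples):
        record = {"ts": clock(), "perf": fetch()}
        sink.write(json.dumps(record, sort_keys=True) + "\n")
        written += 1
        if i + 1 < samples:
            sleep(interval)
    return written


# Example (needs a reachable cluster; "pve-core-1" is the MDS name used
# elsewhere in this thread):
#
#   with open("mds-perf.jsonl", "w") as out:
#       capture(lambda: fetch_perf_dump("pve-core-1"), out, samples=60)
```

The resulting file is a single attachment that can be shared as-is, and each
line can be loaded independently with json.loads.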

Tnx,
---------------------------
Olli Rajala - Lead TD
Anima Vitae Ltd.
www.anima.fi
---------------------------

On Thu, Nov 10, 2022 at 8:18 AM Venky Shankar <vshankar@xxxxxxxxxx> wrote:

> Hi Olli,
>
> On Mon, Oct 17, 2022 at 1:08 PM Olli Rajala <olli.rajala@xxxxxxxx> wrote:
> >
> > Hi Patrick,
> >
> > With "objecter_ops" did you mean "ceph tell mds.pve-core-1 ops" and/or
> > "ceph tell mds.pve-core-1 objecter_requests"? Both of these show very few
> > requests/ops, often just returning empty lists. I'm pretty sure that
> > this I/O isn't generated by any clients: I tried earlier to isolate
> > it by shutting down all CephFS clients, and that didn't have any
> > noticeable effect.
> >
> > I tried to watch what is going on with that "perf dump" but to be honest
> > all I can see is some numbers going up in the different sections :)
> > ...don't have a clue what to focus on and how to interpret that.
> >
> > Here's a perf dump if you or anyone could make something out of that:
> > https://gist.github.com/olliRJL/43c10173aafd82be22c080a9cd28e673
>
> You'd need to capture this over a period of time to see what ops might
> be going through and what the mds is doing.
>
> >
> > Tnx!
> > o.
> >
> > ---------------------------
> > Olli Rajala - Lead TD
> > Anima Vitae Ltd.
> > www.anima.fi
> > ---------------------------
> >
> >
> > On Fri, Oct 14, 2022 at 8:32 PM Patrick Donnelly <pdonnell@xxxxxxxxxx>
> > wrote:
> >
> > > Hello Olli,
> > >
> > > On Thu, Oct 13, 2022 at 5:01 AM Olli Rajala <olli.rajala@xxxxxxxx> wrote:
> > > >
> > > > Hi,
> > > >
> > > > I'm seeing constant 25-50 MB/s writes to the metadata pool even when all
> > > > clients and the cluster are idle and in a clean state. This surely can't
> > > > be normal?
> > > >
> > > > There are no apparent issues with the performance of the cluster, but this
> > > > write rate seems excessive and I don't know where to look for the culprit.
> > > >
> > > > The setup is Ceph 16.2.9 running on a hyperconverged 3-node core cluster
> > > > and 6 HDD OSD nodes.
> > > >
> > > > Here's the typical status when pretty much all clients are idling. Most of
> > > > that write bandwidth and maybe a fifth of the write IOPS is hitting the
> > > > metadata pool.
> > > >
> > > > ---------------------------------------------------------------------------------------------------
> > > > root@pve-core-1:~# ceph -s
> > > >   cluster:
> > > >     id:     2088b4b1-8de1-44d4-956e-aa3d3afff77f
> > > >     health: HEALTH_OK
> > > >
> > > >   services:
> > > >     mon: 3 daemons, quorum pve-core-1,pve-core-2,pve-core-3 (age 2w)
> > > >     mgr: pve-core-1(active, since 4w), standbys: pve-core-2, pve-core-3
> > > >     mds: 1/1 daemons up, 2 standby
> > > >     osd: 48 osds: 48 up (since 5h), 48 in (since 4M)
> > > >
> > > >   data:
> > > >     volumes: 1/1 healthy
> > > >     pools:   10 pools, 625 pgs
> > > >     objects: 70.06M objects, 46 TiB
> > > >     usage:   95 TiB used, 182 TiB / 278 TiB avail
> > > >     pgs:     625 active+clean
> > > >
> > > >   io:
> > > >     client:   45 KiB/s rd, 38 MiB/s wr, 6 op/s rd, 287 op/s wr
> > > >
> > > > ---------------------------------------------------------------------------------------------------
> > > >
> > > > Here's some daemonperf dump:
> > > >
> > > > ---------------------------------------------------------------------------------------------------
> > > > root@pve-core-1:~# ceph daemonperf mds.`hostname -s`
> > > > ----------------------------------------mds----------------------------------------- --mds_cache--- ------mds_log------ -mds_mem- -------mds_server------- mds_ -----objecter------ purg
> > > > req  rlat fwd  inos caps exi  imi  hifc crev cgra ctru cfsa cfa  hcc  hccd hccr prcr|stry recy recd|subm evts segs repl|ino  dn  |hcr  hcs  hsr  cre  cat |sess|actv rd   wr   rdwr|purg|
> > > >  40    0    0  767k  78k   0    0    0    1    6    1    0    0    5    5    3    7 |1.1k   0    0 | 17  3.7k 134    0 |767k 767k| 40    5    0    0    0 |110 |  4    2   21    0 |  2
> > > >  57    2    0  767k  78k   0    0    0    3   16    3    0    0   11   11    0   17 |1.1k   0    0 | 45  3.7k 137    0 |767k 767k| 57    8    0    0    0 |110 |  0    2   28    0 |  4
> > > >  57    4    0  767k  78k   0    0    0    4   34    4    0    0   34   33    2   26 |1.0k   0    0 |134  3.9k 139    0 |767k 767k| 57   13    0    0    0 |110 |  0    2  112    0 | 19
> > > >  67    3    0  767k  78k   0    0    0    6   32    6    0    0   22   22    0   32 |1.1k   0    0 | 78  3.9k 141    0 |767k 768k| 67    4    0    0    0 |110 |  0    2   56    0 |  2
> > > > ---------------------------------------------------------------------------------------------------
> > > > Any ideas where to look at?
> > >
> > > Check the perf dump output of the mds:
> > >
> > > ceph tell mds.<fs_name>:0 perf dump
> > >
> > > over a period of time to identify what's going on. You can also look
> > > at the objecter_ops (another tell command) for the MDS.
> > >
> > > --
> > > Patrick Donnelly, Ph.D.
> > > He / Him / His
> > > Principal Software Engineer
> > > Red Hat, Inc.
> > > GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
> > >
> > >
> > _______________________________________________
> > ceph-users mailing list -- ceph-users@xxxxxxx
> > To unsubscribe send an email to ceph-users-leave@xxxxxxx
> >
>
>
> --
> Cheers,
> Venky
>
>