Hello Olli,

On Thu, Oct 13, 2022 at 5:01 AM Olli Rajala <olli.rajala@xxxxxxxx> wrote:
>
> Hi,
>
> I'm seeing constant 25-50MB/s writes to the metadata pool even when all
> clients and the cluster are idling and in a clean state. This surely can't
> be normal?
>
> There are no apparent issues with the performance of the cluster, but this
> write rate seems excessive and I don't know where to look for the culprit.
>
> The setup is Ceph 16.2.9 running on a hyperconverged 3-node core cluster
> and 6 HDD OSD nodes.
>
> Here's a typical status when pretty much all clients are idling. Most of
> that write bandwidth and maybe a fifth of the write IOPS is hitting the
> metadata pool.
>
> ---------------------------------------------------------------------------------------------------
> root@pve-core-1:~# ceph -s
>   cluster:
>     id:     2088b4b1-8de1-44d4-956e-aa3d3afff77f
>     health: HEALTH_OK
>
>   services:
>     mon: 3 daemons, quorum pve-core-1,pve-core-2,pve-core-3 (age 2w)
>     mgr: pve-core-1(active, since 4w), standbys: pve-core-2, pve-core-3
>     mds: 1/1 daemons up, 2 standby
>     osd: 48 osds: 48 up (since 5h), 48 in (since 4M)
>
>   data:
>     volumes: 1/1 healthy
>     pools:   10 pools, 625 pgs
>     objects: 70.06M objects, 46 TiB
>     usage:   95 TiB used, 182 TiB / 278 TiB avail
>     pgs:     625 active+clean
>
>   io:
>     client:   45 KiB/s rd, 38 MiB/s wr, 6 op/s rd, 287 op/s wr
> ---------------------------------------------------------------------------------------------------
>
> Here's some daemonperf output:
>
> ---------------------------------------------------------------------------------------------------
> root@pve-core-1:~# ceph daemonperf mds.`hostname -s`
> ----------------------------------------mds----------------------------------------- --mds_cache--- ------mds_log------ -mds_mem- -------mds_server------- mds_ -----objecter------ purg
> req rlat fwd inos caps exi imi hifc crev cgra ctru cfsa cfa hcc hccd hccr prcr|stry recy recd|subm evts segs repl|ino  dn  |hcr hcs hsr cre cat |sess|actv rd  wr rdwr|purg|
>  40    0   0 767k  78k   0   0    0    1    6    1    0   0   5    5    3    7|1.1k    0    0|  17 3.7k  134    0|767k 767k| 40   5   0   0   0 |110 |   4   2  21    0|   2
>  57    2   0 767k  78k   0   0    0    3   16    3    0   0  11   11    0   17|1.1k    0    0|  45 3.7k  137    0|767k 767k| 57   8   0   0   0 |110 |   0   2  28    0|   4
>  57    4   0 767k  78k   0   0    0    4   34    4    0   0  34   33    2   26|1.0k    0    0| 134 3.9k  139    0|767k 767k| 57  13   0   0   0 |110 |   0   2 112    0|  19
>  67    3   0 767k  78k   0   0    0    6   32    6    0   0  22   22    0   32|1.1k    0    0|  78 3.9k  141    0|767k 768k| 67   4   0   0   0 |110 |   0   2  56    0|   2
> ---------------------------------------------------------------------------------------------------
>
> Any ideas where to look?

Check the perf dump output of the MDS:

    ceph tell mds.<fs_name>:0 perf dump

over a period of time to identify what's going on. You can also look at the
objecter_ops (another tell command) for the MDS.

--
Patrick Donnelly, Ph.D.
He / Him / His
Principal Software Engineer
Red Hat, Inc.
GPG: 19F28A586F808C2402351B93C3301A3E258DD79D
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx
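
A minimal sketch of how the comparison suggested above could be scripted. It assumes the filesystem is named "cephfs" (substitute the real name from `ceph fs status`), that rank 0 is the only active MDS as in the status output in the thread, and that jq is installed; the 60-second interval and the /tmp paths are arbitrary, and the objecter ops are dumped here via the objecter_requests admin command (an assumption on my part, not a command named in the thread).

---------------------------------------------------------------------------------------------------
#!/usr/bin/env bash
# Sketch: take two MDS perf dump snapshots 60 seconds apart and show which
# counters moved. Assumptions: fs name "cephfs", active rank 0, jq installed.
ceph tell mds.cephfs:0 perf dump > /tmp/mds_perf_a.json
sleep 60
ceph tell mds.cephfs:0 perf dump > /tmp/mds_perf_b.json

# Counters that changed between the two snapshots point at the busy subsystem
# (e.g. the mds_log, purge_queue, or objecter sections).
diff <(jq -S . /tmp/mds_perf_a.json) <(jq -S . /tmp/mds_perf_b.json)

# In-flight objecter operations from the MDS: shows what it is currently
# writing to the metadata pool (command name assumed, see note above).
ceph tell mds.cephfs:0 objecter_requests
---------------------------------------------------------------------------------------------------

Whichever section's counters keep climbing while the clients are idle is the place to dig further.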