Re: Please help collecting stats of Ceph monitor disk writes

Here is some data from a small, very lightly loaded cluster. It is manually deployed on Debian 11, with the mon store on an SSD:

1) iotop results:

    TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
   1923 be/4 ceph          0.00 B    104.00 K  ?unavailable?  ceph-mon -f --cluster ceph --id mon1 --setuser ceph --setgroup ceph [log]
   1951 be/4 ceph          0.00 B    264.82 M  ?unavailable?  ceph-mon -f --cluster ceph --id mon1 --setuser ceph --setgroup ceph [rocksdb:low0]
   1952 be/4 ceph          0.00 B      7.25 M  ?unavailable?  ceph-mon -f --cluster ceph --id mon1 --setuser ceph --setgroup ceph [rocksdb:high0]
   2148 be/4 ceph          0.00 B      6.51 M  ?unavailable?  ceph-mon -f --cluster ceph --id mon1 --setuser ceph --setgroup ceph [fn_monstore]
   2155 be/4 ceph          0.00 B      3.62 M  ?unavailable?  ceph-mon -f --cluster ceph --id mon1 --setuser ceph --setgroup ceph [safe_timer]
   2160 be/4 ceph          0.00 B    248.00 K  ?unavailable?  ceph-mon -f --cluster ceph --id mon1 --setuser ceph --setgroup ceph [ms_dispatch]
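
That is about 282 MB written by the ceph-mon threads in total over the
5-minute sample, i.e. just under 1 MB/s, almost all of it from the
rocksdb:low0 background compaction thread.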

2) manual compactions:

Fri 13 Oct 11:23:48 BST 2023
3780

3) monitor store.db size:

136M	/var/lib/ceph/mon/ceph-mon1/store.db/

4) cluster version and status:

root@mon1:~# ceph version; ceph -s
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
  cluster:
    id:     9208361c-5b68-41ed-8155-cc246a3fe538
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum mon1,mon2,mon3 (age 5d)
    mgr: mon3(active, since 5d), standbys: mon1, mon2
    mds: 1/1 daemons up, 2 standby
    osd: 18 osds: 18 up (since 5d), 18 in (since 5d)
    rgw: 3 daemons active (3 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   14 pools, 745 pgs
    objects: 3.32M objects, 7.2 TiB
    usage:   21 TiB used, 24 TiB / 46 TiB avail
    pgs:     745 active+clean

  io:
    client:   0 B/s rd, 432 KiB/s wr, 0 op/s rd, 5 op/s wr



On 13/10/2023 07:58, Zakhar Kirpichenko wrote:
Hi!

Further to my thread "Ceph 16.2.x mon compactions, disk writes"
(https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/thread/XGCI2LFW5RH3GUOQFJ542ISCSZH3FRX2/),
where we established that Ceph monitors do write considerable amounts of
data to disk, I would like to ask fellow Ceph users for feedback and help
gathering statistics on whether this happens on all clusters or only on a
specific subset of them.

The procedure is rather simple and won't take much of your time.

If you are willing to help, please follow this procedure:

---------

1. Install iotop and run the following command on any of your monitor nodes:

iotop -ao -bn 2 -d 300 2>&1 | grep -E "TID|ceph-mon"

This will collect 5 minutes of disk I/O statistics (two batch samples taken
300 seconds apart, with accumulated totals) and produce output containing
the stats for the Ceph monitor threads running on the node:

     TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
     TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
    4854 be/4 167           8.62 M      2.27 G  0.00 %  0.72 % ceph-mon -n
mon.ceph04 -f --setuser ceph --setgroup ceph --default-log-to-file=false
--default-log-to-stderr=true --default-log-stderr-prefix=debug
  --default-mon-cluster-log-to-file=false
--default-mon-cluster-log-to-stderr=true [rocksdb:low0]
    4919 be/4 167           0.00 B     39.43 M  0.00 %  0.02 % ceph-mon -n
mon.ceph04 -f --setuser ceph --setgroup ceph --default-log-to-file=false
--default-log-to-stderr=true --default-log-stderr-prefix=debug
  --default-mon-cluster-log-to-file=false
--default-mon-cluster-log-to-stderr=true [ms_dispatch]
    4855 be/4 167           8.00 K     19.55 M  0.00 %  0.00 % ceph-mon -n
mon.ceph04 -f --setuser ceph --setgroup ceph --default-log-to-file=false
--default-log-to-stderr=true --default-log-stderr-prefix=debug
  --default-mon-cluster-log-to-file=false
--default-mon-cluster-log-to-stderr=true [rocksdb:high0]

We're particularly interested in the amount of written data.
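
Should iotop be unavailable on the node, one rough alternative (per process
rather than per thread, and it needs root) is to sample write_bytes from
/proc/<pid>/io for the monitor process, along these lines:

# pid=$(pidof ceph-mon); grep '^write_bytes' /proc/$pid/io; sleep 300; grep '^write_bytes' /proc/$pid/io

The difference between the two write_bytes values approximates the amount
written during the 5-minute window.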

---------

2. Optional: collect the number of "manual compaction" events from the
monitor.

This step depends on how your monitor runs. My cluster is managed by
cephadm and the monitors run in Docker containers, so I can do something
like this, where MYMONCONTAINERID is the container ID of the Ceph monitor:

# date; d=$(date +'%Y-%m-%d'); docker logs MYMONCONTAINERID 2>&1 | grep $d | grep -ci "manual compaction from"
Fri 13 Oct 2023 06:29:39 AM UTC
580

Alternatively, I could run the command against the log file MYMONLOGFILE,
whose location I obtained with docker inspect:

# date; d=$(date +'%Y-%m-%d'); grep $d MYMONLOGFILE | grep -ci "manual compaction from"
Fri 13 Oct 2023 06:35:27 AM UTC
588

If you run monitors with podman or without containerization, please collect
this information in whatever way is most convenient in your setup.
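
For example, a non-containerized monitor usually logs to
/var/log/ceph/ceph-mon.NAME.log, so something along these lines should work:

# date; d=$(date +'%Y-%m-%d'); grep $d /var/log/ceph/ceph-mon.NAME.log | grep -ci "manual compaction from"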

---------

3. Optional: collect the monitor store.db size.

Usually the monitor store.db is available at
/var/lib/ceph/FSID/mon.NAME/store.db/, for example:

# du -hs /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/mon.ceph04/store.db/
642M    /var/lib/ceph/3f50555a-ae2a-11eb-a2fc-ffde44714d86/mon.ceph04/store.db/
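
On a non-cephadm deployment the store is typically located under
/var/lib/ceph/mon/CLUSTER-NAME/store.db/ instead (CLUSTER is usually
"ceph"), so adjust the path accordingly, for example:

# du -hs /var/lib/ceph/mon/ceph-NAME/store.db/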

---------

4. Optional: collect Ceph cluster version and status.

For example:

root@ceph01:/# ceph version; ceph -s
ceph version 16.2.14 (238ba602515df21ea7ffc75c88db29f9e5ef12c9) pacific (stable)
   cluster:
     id:     3f50555a-ae2a-11eb-a2fc-ffde44714d86
     health: HEALTH_OK

   services:
     mon: 5 daemons, quorum ceph01,ceph03,ceph04,ceph05,ceph02 (age 2w)
     mgr: ceph01.vankui(active, since 13d), standbys: ceph02.shsinf
     osd: 96 osds: 96 up (since 2w), 95 in (since 3w)

   data:
     pools:   10 pools, 2400 pgs
     objects: 6.30M objects, 16 TiB
     usage:   61 TiB used, 716 TiB / 777 TiB avail
     pgs:     2396 active+clean
              3    active+clean+scrubbing+deep
              1    active+clean+scrubbing

   io:
     client:   71 MiB/s rd, 60 MiB/s wr, 2.94k op/s rd, 2.56k op/s wr

---------

5. Reply to this thread and submit the collected information.

For example:

1) iotop results:
... Paste data obtained in step 1)

2) manual compactions:
... Paste data obtained in step 2), or put "N/A"

3) monitor store.db size:
... Paste data obtained in step 3), or put "N/A"

4) cluster version and status:
... Paste data obtained in step 4), or put "N/A"

-------------

I would very much appreciate your effort and help with gathering these
stats. Please don't hesitate to contact me with any questions or concerns.

Best regards,

Zakhar
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx


