Re: OSD META usage growing without bounds

Frank Schilder <frans@xxxxxx> · Tue, 11 Jan 2022 11:45:11 +0000

Hi Igor,

thanks for your reply. To avoid further OSD failures, I shut down the cluster yesterday. Unfortunately, after the restart all OSDs trimmed whatever had been filling them up:

[root@rit-tceph ~]# ceph osd df tree
ID CLASS WEIGHT  REWEIGHT SIZE    USE     DATA    OMAP   META     AVAIL   %USE VAR  PGS TYPE NAME
-1       2.44707        - 2.4 TiB 9.2 GiB 255 MiB 25 KiB  9.0 GiB 2.4 TiB 0.37 1.00   - root default
-5       0.81569        - 835 GiB 3.1 GiB  85 MiB 19 KiB  3.0 GiB 832 GiB 0.37 1.00   -     host tceph-01
 0   hdd 0.27190  1.00000 278 GiB 1.0 GiB  28 MiB 19 KiB 1024 MiB 277 GiB 0.37 1.00 169         osd.0
 3   hdd 0.27190  1.00000 278 GiB 1.0 GiB  28 MiB    0 B    1 GiB 277 GiB 0.37 1.00 164         osd.3
 8   hdd 0.27190  1.00000 278 GiB 1.0 GiB  28 MiB    0 B    1 GiB 277 GiB 0.37 1.00 167         osd.8
-3       0.81569        - 835 GiB 3.1 GiB  85 MiB  3 KiB  3.0 GiB 832 GiB 0.37 1.00   -     host tceph-02
 2   hdd 0.27190  1.00000 278 GiB 1.0 GiB  28 MiB    0 B    1 GiB 277 GiB 0.37 1.00 157         osd.2
 4   hdd 0.27190  1.00000 278 GiB 1.0 GiB  29 MiB    0 B    1 GiB 277 GiB 0.37 1.00 172         osd.4
 6   hdd 0.27190  1.00000 278 GiB 1.0 GiB  28 MiB  3 KiB 1024 MiB 277 GiB 0.37 1.00 171         osd.6
-7       0.81569        - 835 GiB 3.1 GiB  85 MiB  3 KiB  3.0 GiB 832 GiB 0.37 1.00   -     host tceph-03
 1   hdd 0.27190  1.00000 278 GiB 1.0 GiB  28 MiB    0 B    1 GiB 277 GiB 0.37 1.00 171         osd.1
 5   hdd 0.27190  1.00000 278 GiB 1.0 GiB  28 MiB  3 KiB 1024 MiB 277 GiB 0.37 1.00 160         osd.5
 7   hdd 0.27190  1.00000 278 GiB 1.0 GiB  28 MiB    0 B    1 GiB 277 GiB 0.37 1.00 169         osd.7
                    TOTAL 2.4 TiB 9.2 GiB 255 MiB 25 KiB  9.0 GiB 2.4 TiB 0.37
MIN/MAX VAR: 1.00/1.00  STDDEV: 0

The OSDs didn't log what they were doing on startup. The log goes straight from bluefs init to PG scrub messages, with a very long wait in between. iostat showed very heavy reads on the drives during the trimming/boot phase.

I'm not sure it helps to collect perf counters right now, so I will wait until I see unusual growth in META again. I don't think the problem was there from the start; it looked more like the OSDs started filling up META independently, at different times. I will let the cluster sit idle as before and keep watching. I hope I find something.
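
For the watching part, a minimal sketch of how the check could be scripted. It assumes the JSON form of the command above ("ceph osd df tree -f json") has a "nodes" list whose OSD entries carry a "kb_used_meta" field; I have not verified those field names on 13.2.10. The helper name meta_heavy_osds and the 5 GiB limit are made up for illustration (META sits at about 1 GiB per OSD right after the restart).

```python
import json
import subprocess


def meta_heavy_osds(osd_df, limit_gib=5.0):
    """Return [(osd_name, meta_gib)] for OSDs whose META exceeds limit_gib.

    osd_df is the parsed JSON of 'ceph osd df tree -f json'. The field
    names 'nodes', 'type', 'name' and 'kb_used_meta' are assumptions.
    """
    flagged = []
    for node in osd_df.get("nodes", []):
        if node.get("type") != "osd":
            continue  # skip root and host buckets
        meta_gib = node.get("kb_used_meta", 0) / (1024 * 1024)  # KiB -> GiB
        if meta_gib > limit_gib:
            flagged.append((node["name"], round(meta_gib, 1)))
    # Largest META first
    return sorted(flagged, key=lambda item: item[1], reverse=True)


def current_osd_df():
    """Fetch the current OSD usage as a dict (needs a working ceph CLI)."""
    out = subprocess.check_output(["ceph", "osd", "df", "tree", "-f", "json"])
    return json.loads(out)
```

Run from cron as meta_heavy_osds(current_osd_df()), this would list the OSDs whose META has started to grow, which is the point at which collecting perf dumps becomes worthwhile.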

Thanks and best regards,
=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: Igor Fedotov <igor.fedotov@xxxxxxxx>
Sent: 11 January 2022 10:27:14
To: Frank Schilder; ceph-users
Subject: Re:  OSD META usage growing without bounds

Hi Frank,

you might want to collect a couple of perf dumps for the OSD in question
at an interval of, e.g., one hour, and inspect which counters are growing
in the bluefs sections. "log_bytes" is of particular interest...
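
A minimal sketch of such a comparison, assuming the dumps are taken with "ceph daemon osd.<id> perf dump" on the host that runs the OSD and that the output has a top-level "bluefs" section with numeric counters. The helper names perf_dump and growing_counters are made up for illustration.

```python
import json
import subprocess


def perf_dump(osd_id):
    """Return the perf counters of one OSD as a dict.

    Runs 'ceph daemon osd.<id> perf dump', which talks to the admin
    socket and therefore has to be run on the OSD's own host.
    """
    out = subprocess.check_output(
        ["ceph", "daemon", f"osd.{osd_id}", "perf", "dump"]
    )
    return json.loads(out)


def _is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def growing_counters(before, after, section="bluefs"):
    """Return {counter: (old, new, delta)} for counters that increased.

    Only plain numeric counters are compared; nested entries (for
    example latency counters stored as small dicts) are skipped.
    """
    old = before.get(section, {})
    new = after.get(section, {})
    grown = {}
    for name, value in new.items():
        prev = old.get(name)
        if _is_number(value) and _is_number(prev) and value > prev:
            grown[name] = (prev, value, value - prev)
    return grown
```

Usage would be something like: first = perf_dump(1), wait an hour, second = perf_dump(1), then print growing_counters(first, second) and check whether "log_bytes" is among the results.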

Thanks,

Igor

On 1/10/2022 2:25 PM, Frank Schilder wrote:
> Hi, I'm observing strange behaviour on a small test cluster (13.2.10 (564bdc4ae87418a232fc901524470e1a0f76d641) mimic (stable)). The cluster has been up for about half a year and is almost empty. We did a few rbd bench runs and created a file system, but there has been zero client IO for at least 3 months. It looks like the OSD META usage of some OSDs recently started to increase for no apparent reason. One OSD already died at 100% usage and another is on its way. I can't see any obvious reason for this strange behaviour.
>
> If anyone has an idea, please let me know.
>
> Some diagnostic output:
>
> [root@rit-tceph ~]# ceph status
>    cluster:
>      id:     bf1f51f5-b381-4cf7-b3db-88d044c1960c
>      health: HEALTH_WARN
>              1 nearfull osd(s)
>              3 pool(s) nearfull
>
>    services:
>      mon: 3 daemons, quorum tceph-01,tceph-02,tceph-03
>      mgr: tceph-01(active), standbys: tceph-02, tceph-03
>      mds: testfs-1/1/1 up  {0=tceph-01=up:active}, 2 up:standby
>      osd: 9 osds: 8 up, 8 in
>
>    data:
>      pools:   3 pools, 500 pgs
>      objects: 24  objects, 2.3 KiB
>      usage:   746 GiB used, 1.4 TiB / 2.2 TiB avail
>      pgs:     500 active+clean
>
> [root@rit-tceph ~]# ceph df
> GLOBAL:
>      SIZE        AVAIL       RAW USED     %RAW USED
>      2.2 TiB     1.4 TiB      746 GiB         33.49
> POOLS:
>      NAME                ID     USED        %USED     MAX AVAIL     OBJECTS
>      test                1         19 B         0        81 GiB           2
>      testfs_data         2          0 B         0        81 GiB           0
>      testfs_metadata     3      2.2 KiB         0        81 GiB          22
>
> [root@rit-tceph ~]# ceph osd df tree
> ID CLASS WEIGHT  REWEIGHT SIZE    USE     DATA    OMAP   META    AVAIL   %USE  VAR  PGS TYPE NAME
> -1       2.44707        - 2.2 TiB 746 GiB 120 MiB 34 KiB 746 GiB 1.4 TiB 33.49 1.00   - root default
> -5       0.81569        - 557 GiB 195 GiB  30 MiB  3 KiB 195 GiB 362 GiB 35.04 1.05   -     host tceph-01
>   0   hdd 0.27190  1.00000 278 GiB  38 GiB  15 MiB  3 KiB  38 GiB 241 GiB 13.61 0.41 260         osd.0
>   3   hdd 0.27190        0     0 B     0 B     0 B    0 B     0 B     0 B     0    0   0         osd.3
>   8   hdd 0.27190  1.00000 278 GiB 157 GiB  15 MiB    0 B 157 GiB 121 GiB 56.47 1.69 240         osd.8
> -3       0.81569        - 835 GiB 113 GiB  45 MiB  3 KiB 113 GiB 723 GiB 13.48 0.40   -     host tceph-02
>   2   hdd 0.27190  1.00000 278 GiB  18 GiB  15 MiB    0 B  18 GiB 261 GiB  6.30 0.19 157         osd.2
>   4   hdd 0.27190  1.00000 278 GiB  48 GiB  15 MiB    0 B  48 GiB 231 GiB 17.21 0.51 172         osd.4
>   6   hdd 0.27190  1.00000 278 GiB  47 GiB  15 MiB  3 KiB  47 GiB 231 GiB 16.93 0.51 171         osd.6
> -7       0.81569        - 835 GiB 438 GiB  45 MiB 28 KiB 438 GiB 397 GiB 52.48 1.57   -     host tceph-03
>   1   hdd 0.27190  1.00000 278 GiB 238 GiB  15 MiB 25 KiB 238 GiB  41 GiB 85.35 2.55 171         osd.1
>   5   hdd 0.27190  1.00000 278 GiB 200 GiB  15 MiB  3 KiB 200 GiB  79 GiB 71.68 2.14 160         osd.5
>   7   hdd 0.27190  1.00000 278 GiB 1.1 GiB  15 MiB    0 B 1.1 GiB 277 GiB  0.40 0.01 169         osd.7
>                      TOTAL 2.2 TiB 746 GiB 120 MiB 34 KiB 746 GiB 1.4 TiB 33.49
> MIN/MAX VAR: 0.01/2.55  STDDEV: 30.50
>
> 2 hours later:
>
> [root@rit-tceph ~]# ceph status
>    cluster:
>      id:     bf1f51f5-b381-4cf7-b3db-88d044c1960c
>      health: HEALTH_WARN
>              1 nearfull osd(s)
>              3 pool(s) nearfull
>
>    services:
>      mon: 3 daemons, quorum tceph-01,tceph-02,tceph-03
>      mgr: tceph-01(active), standbys: tceph-02, tceph-03
>      mds: testfs-1/1/1 up  {0=tceph-01=up:active}, 2 up:standby
>      osd: 9 osds: 8 up, 8 in
>
>    data:
>      pools:   3 pools, 500 pgs
>      objects: 24  objects, 2.3 KiB
>      usage:   748 GiB used, 1.4 TiB / 2.2 TiB avail
>      pgs:     500 active+clean
>
> The usage is increasing surprisingly fast.
>
> Thanks for any pointers!
> =================
> Frank Schilder
> AIT Risø Campus
> Bygning 109, rum S14
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx

--
Igor Fedotov
Ceph Lead Developer

