Hello,

Our Ceph cluster's performance has become horrifically slow over the past few months. Nobody here is terribly familiar with Ceph, and we've inherited this cluster without much direction.

Architecture: 40Gbps QDR IB fabric between all Ceph nodes and our oVirt VM hosts. 11 OSD nodes with a total of 163 OSDs. 14 pools, 3616 PGs, 1.19PiB raw capacity.

Ceph versions:

{
    "mon": {
        "ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)": 3
    },
    "mgr": {
        "ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)": 3
    },
    "osd": {
        "ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)": 118,
        "ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)": 22,
        "ceph version 12.2.13 (584a20eb0237c657dc0567da126be145106aa47e) luminous (stable)": 19
    },
    "mds": {},
    "overall": {
        "ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)": 124,
        "ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)": 22,
        "ceph version 12.2.13 (584a20eb0237c657dc0567da126be145106aa47e) luminous (stable)": 19
    }
}

The majority of disks are spindles, but there are also NVMe SSDs. There is a lot of variability in drive sizes - two different sets of admins added disks sized between 6TB and 16TB - and I suspect this, combined with imbalanced weighting, is to blame. Performance on the oVirt VMs can dip as low as several *kilobytes* per second (!) on reads and a few MB/sec on writes. There are also several scrub errors. In short, it's a complete wreck.

STATUS:

[root@ceph-admin davei]# ceph -s
  cluster:
    id:     1b8d958c-e50b-40ef-a681-16cfeb9390b8
    health: HEALTH_ERR
            3 scrub errors
            Possible data damage: 3 pgs inconsistent

  services:
    mon: 3 daemons, quorum ceph1,ceph2,ceph3
    mgr: ceph3(active), standbys: ceph2, ceph1
    osd: 163 osds: 159 up, 158 in

  data:
    pools:   14 pools, 3616 pgs
    objects: 46.28M objects, 174TiB
    usage:   527TiB used, 694TiB / 1.19PiB avail
    pgs:     3609 active+clean
             4    active+clean+scrubbing+deep
             3    active+clean+inconsistent

  io:
    client:  74.3MiB/s rd, 96.0MiB/s wr, 3.85kop/s rd, 3.68kop/s wr

---

HEALTH:

[root@ceph-admin davei]# ceph health detail
HEALTH_ERR 3 scrub errors; Possible data damage: 3 pgs inconsistent
OSD_SCRUB_ERRORS 3 scrub errors
PG_DAMAGED Possible data damage: 3 pgs inconsistent
    pg 2.8a is active+clean+inconsistent, acting [13,152,127]
    pg 2.ce is active+clean+inconsistent, acting [145,13,152]
    pg 2.e8 is active+clean+inconsistent, acting [150,162,42]

---

CEPH OSD DF (not going to paste it all in here): https://pastebin.com/CNW5RKWx

What else am I missing in terms of what to share with you all? And any advice on how we should reweight these OSDs to get performance back to something usable?

Thanks all,

-Dave
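P.S. So as not to just throw the problem over the wall: here is what I
was planning to try for the imbalance, pieced together from the docs.
This assumes the mgr balancer module is usable on Luminous and that all
our clients are Luminous or newer (which I gather upmap mode requires) -
please tell me if any of this is a bad idea. The 110% overload threshold
is just a number I picked:

    # See how far utilization has drifted per OSD and per host
    ceph osd df tree

    # Dry run only: report what reweight-by-utilization *would* change,
    # treating OSDs above 110% of mean utilization as overloaded
    ceph osd test-reweight-by-utilization 110

    # Or, instead, let the mgr balancer rebalance continuously
    # (upmap mode requires luminous-or-newer clients)
    ceph mgr module enable balancer
    ceph osd set-require-min-compat-client luminous
    ceph balancer mode upmap
    ceph balancer on
    ceph balancer status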
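For the three inconsistent PGs, my tentative plan is below. I've read
that a blind repair can copy from the primary even when the primary
holds the bad object, so I want to inspect before repairing (pg 2.8a
shown; same for 2.ce and 2.e8):

    # Identify which replica is actually bad before repairing
    rados list-inconsistent-obj 2.8a --format=json-pretty

    # If it's a simple read/checksum error on one replica,
    # repair should be safe
    ceph pg repair 2.8a

    # Re-verify once the repair finishes
    ceph pg deep-scrub 2.8a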
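And to check whether a few dying spindles are dragging the whole cluster
down, I was going to look at per-OSD latency (osd.13 below is just an
example pulled from the acting sets above):

    # Commit/apply latency per OSD; outliers here usually point at
    # specific failing or overloaded disks
    ceph osd perf

    # On the host carrying a suspect OSD, dump its slowest recent ops
    ceph daemon osd.13 dump_historic_ops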