Hi,

I have a currently-down ceph cluster:
* v17.2.0 / quay.io/v17.2.0-20220420
* 3 nodes, 4 OSDs
* around 1TiB used of 3TiB total
* probably enough resources
  - two of those nodes have 64GiB memory, the third has 16GiB
  - one of the 64GiB nodes runs two OSDs, as it's a physical node with 2 NVMe drives
* provisioned via Rook and running in my Kubernetes cluster

After some upgrades yesterday (system packages on the nodes) and today (Kubernetes to the latest version), I wanted to reboot my nodes. Draining the first node put a lot of stress on the other OSDs, making them go OOM - I think that is probably a bug in itself, as at least one of those nodes has enough resources (64GiB memory, physical machine, surely ~40GiB free - but I don't have metrics right now as everything is down).

I'm now seeing all OSDs go OOM right on startup. From what I can tell, everything is fine until right after `load_pgs` - as soon as an OSD activates some PGs, memory usage increases _a lot_ (from ~4-5GiB RES before to ~60GiB, though that depends on the free memory on the node). Because of this, I cannot get any of them online again and need advice on what to do and what information might be useful.

Logs of one of those OSDs are here[1] (captured via kubectl logs, so something from right at the start might be missing - happy to dig deeper if you need more), and my changed ceph.conf entries are here[2]. I had `bluefs_buffered_io = false` until today and changed it to true based on a suggestion in another debug thread[3]; see the P.S. below for the exact snippet.

Any hint is greatly appreciated, many thanks

Mara Grosch

[1] https://pastebin.com/VFczNqUk
[2] https://pastebin.com/QXust5XD
[3] https://lists.ceph.io/hyperkitty/list/ceph-users@xxxxxxx/message/CBPXLPWEVZLZE55WAQSMB7KSIQPV5I76/
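
P.S. For reference, this is the change I made - a minimal sketch from memory; the exact section placement in my real file may differ, the full set of changed entries is in [2]:

    [osd]
    # was false until today; flipped to true based on the suggestion in [3]
    bluefs_buffered_io = true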