It's a 6-node cluster with 96 OSDs and not much I/O. Each node has 384 GB of RAM, each OSD has a memory target of 16 GB, and about 100 GB of memory, give or take, is available (mostly used by page cache) on each node during normal operation. Nothing unusual there, tbh. No unusual mgr modules or settings either, except for disabled progress:

{
    "always_on_modules": [
        "balancer",
        "crash",
        "devicehealth",
        "orchestrator",
        "pg_autoscaler",
        "progress",
        "rbd_support",
        "status",
        "telemetry",
        "volumes"
    ],
    "enabled_modules": [
        "cephadm",
        "dashboard",
        "iostat",
        "prometheus",
        "restful"
    ],
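For reference, the module state above can be dumped and progress reporting toggled with something like the following; the command names are from memory and may vary by release, so treat this as a sketch rather than authoritative syntax:

    ceph mgr module ls    # dump always_on_modules / enabled_modules / disabled_modules
    ceph progress off     # stop the progress module from tracking new events
    ceph progress on      # re-enable it later if needed
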
/Z

On Wed, 22 Nov 2023, 14:52 Eugen Block, <eblock@xxxxxx> wrote:

> What does your hardware look like memory-wise? Just for comparison,
> one customer cluster has 4,5 GB in use (middle-sized cluster for
> openstack, 280 OSDs):
>
>     PID USER  PR  NI    VIRT    RES   SHR S  %CPU %MEM     TIME+ COMMAND
>    6077 ceph  20   0 6357560 4,522g 22316 S 12,00 1,797 57022:54 ceph-mgr
>
> In our own cluster (smaller than that and not really heavily used) the
> mgr uses almost 2 GB. So those numbers you have seem relatively small.
>
> Zitat von Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>
> > I've disabled the progress module entirely and will see how it goes.
> > Otherwise, mgr memory usage keeps increasing slowly; from past experience
> > it will stabilize at around 1.5-1.6 GB. Other than this event warning,
> > it's unclear what could have caused random memory ballooning.
> >
> > /Z
> >
> > On Wed, 22 Nov 2023 at 13:07, Eugen Block <eblock@xxxxxx> wrote:
> >
> >> I see these progress messages all the time; I don't think they cause
> >> it, but I might be wrong. You can disable it just to rule that out.
> >>
> >> Zitat von Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
> >>
> >> > Unfortunately, I don't have a full stack trace because there's no crash
> >> > when the mgr gets oom-killed. There's just the mgr log, which looks
> >> > completely normal until about 2-3 minutes before the oom-kill, when
> >> > tcmalloc warnings show up.
> >> >
> >> > I'm not sure that it's the same issue that is described in the tracker.
> >> > We seem to have some stale "events" in the progress module though:
> >> >
> >> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev cacc4230-75ee-4892-b8fd-a19fec8f9f66 does not exist
> >> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev 44824331-3f6b-45c4-b925-423d098c3c76 does not exist
> >> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev 0139bc54-ae42-4483-b278-851d77f23f9f does not exist
> >> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev f9d6c20e-b8d8-4625-b9cf-84da1244c822 does not exist
> >> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev 1486b26d-2a23-4416-a864-2cbb0ecf1429 does not exist
> >> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev 7f14d01c-498c-413f-b2ef-05521050190a does not exist
> >> > Nov 21 14:57:35 ceph01 bash[3941523]: debug 2023-11-21T14:57:35.950+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev 48cbd97f-82f7-4b80-8086-890fff6e0824 does not exist
> >> >
> >> > I tried clearing them but they keep showing up. I am wondering if these
> >> > missing events can cause memory leaks over time.
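> >> >
> >> > For reference, "clearing" here means roughly the following; the command
> >> > names are from memory and may differ between releases, so take this as
> >> > a sketch:
> >> >
> >> >     ceph progress clear    # drop all events the progress module currently tracks
> >> >     ceph progress json     # dump the remaining events to see whether the stale ones come back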
> >> >
> >> > /Z
> >> >
> >> > On Wed, 22 Nov 2023 at 11:12, Eugen Block <eblock@xxxxxx> wrote:
> >> >
> >> >> Do you have the full stack trace? The pastebin only contains the
> >> >> "tcmalloc: large alloc" messages (same as in the tracker issue). Maybe
> >> >> comment in the tracker issue directly since Radek asked for someone
> >> >> with a similar problem in a newer release.
> >> >>
> >> >> Zitat von Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
> >> >>
> >> >> > Thanks, Eugen. It is similar in the sense that the mgr is getting
> >> >> > OOM-killed.
> >> >> >
> >> >> > It started happening in our cluster after the upgrade to 16.2.14. We
> >> >> > haven't had this issue with earlier Pacific releases.
> >> >> >
> >> >> > /Z
> >> >> >
> >> >> > On Tue, 21 Nov 2023, 21:53 Eugen Block, <eblock@xxxxxx> wrote:
> >> >> >
> >> >> >> Just checking it on the phone, but isn’t this quite similar?
> >> >> >>
> >> >> >> https://tracker.ceph.com/issues/45136
> >> >> >>
> >> >> >> Zitat von Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
> >> >> >>
> >> >> >> > Hi,
> >> >> >> >
> >> >> >> > I'm facing a rather new issue with our Ceph cluster: from time to time
> >> >> >> > ceph-mgr on one of the two mgr nodes gets oom-killed after consuming over
> >> >> >> > 100 GB RAM:
> >> >> >> >
> >> >> >> > [Nov21 15:02] tp_osd_tp invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
> >> >> >> > [ +0.000010] oom_kill_process.cold+0xb/0x10
> >> >> >> > [ +0.000002] [ pid ] uid tgid total_vm rss pgtables_bytes swapents oom_score_adj name
> >> >> >> > [ +0.000008] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=504d37b566d9fd442d45904a00584b4f61c93c5d49dc59eb1c948b3d1c096907,mems_allowed=0-1,global_oom,task_memcg=/docker/3826be8f9115479117ddb8b721ca57585b2bdd58a27c7ed7b38e8d83eb795957,task=ceph-mgr,pid=3941610,uid=167
> >> >> >> > [ +0.000697] Out of memory: Killed process 3941610 (ceph-mgr) total-vm:146986656kB, anon-rss:125340436kB, file-rss:0kB, shmem-rss:0kB, UID:167 pgtables:260356kB oom_score_adj:0
> >> >> >> > [ +6.509769] oom_reaper: reaped process 3941610 (ceph-mgr), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
> >> >> >> >
> >> >> >> > The cluster is stable and operating normally, there's nothing unusual
> >> >> >> > going on before, during or after the kill, thus it's unclear what causes
> >> >> >> > the mgr to balloon, use all RAM and get killed. Systemd logs aren't very
> >> >> >> > helpful: they just show normal mgr operations until it fails to allocate
> >> >> >> > memory and gets killed: https://pastebin.com/MLyw9iVi
> >> >> >> >
> >> >> >> > The mgr experienced this issue several times in the last 2 months, and
> >> >> >> > the events don't appear to correlate with any other events in the cluster
> >> >> >> > because basically nothing else happened at around those times. How can I
> >> >> >> > investigate this and figure out what's causing the mgr to consume all
> >> >> >> > memory and get killed?
> >> >> >> >
> >> >> >> > I would very much appreciate any advice!
> >> >> >> >
> >> >> >> > Best regards,
> >> >> >> > Zakhar
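
A practical way to catch the mgr's growth curve before the next OOM kill is to sample its RSS periodically on each mgr node. This is only a sketch: it assumes the containerized ceph-mgr process is visible to ps on the host (which the docker cgroup in the OOM log above suggests), and the log path and interval are arbitrary:

    # append a timestamped RSS sample for ceph-mgr once a minute
    while true; do
        ps -o rss=,etimes= -C ceph-mgr | \
            awk -v ts="$(date -Is)" '{printf "%s rss_mib=%.0f uptime_s=%s\n", ts, $1/1024, $2}'
        sleep 60
    done >> /var/log/ceph-mgr-rss.log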