Hi,

Today, after 3 weeks of normal operation, the mgr reached a memory usage of 1600 MB, then quickly ballooned to over 100 GB for no apparent reason and got oom-killed again. There were no suspicious messages in the logs until the message indicating that the mgr had failed to allocate more memory. Any thoughts?

/Z

On Mon, 11 Dec 2023 at 12:34, Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:
> Hi,
>
> Another update: after 2 more weeks the mgr process grew to ~1.5 GB, which again was expected:
>
> mgr.ceph01.vankui  ceph01  *:8443,9283  running (2w)  102s ago  2y  1519M  -  16.2.14  fc0182d6cda5  3451f8c6c07e
> mgr.ceph02.shsinf  ceph02  *:8443,9283  running (2w)  102s ago  7M   112M  -  16.2.14  fc0182d6cda5  1c3d2d83b6df
>
> The cluster is healthy and operating normally, and the mgr process is growing slowly. It's still unclear what caused the ballooning and OOM issue under very similar conditions.
>
> /Z
>
> On Sat, 25 Nov 2023 at 08:31, Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:
>> Hi,
>>
>> A small update: after disabling the 'progress' module, the active mgr (on ceph01) used up ~1.3 GB of memory in 3 days, which was expected:
>>
>> mgr.ceph01.vankui  ceph01  *:8443,9283  running (3d)  9m ago  2y  1284M  -  16.2.14  fc0182d6cda5  3451f8c6c07e
>> mgr.ceph02.shsinf  ceph02  *:8443,9283  running (3d)  9m ago  7M   374M  -  16.2.14  fc0182d6cda5  1c3d2d83b6df
>>
>> The cluster is healthy and operating normally. The mgr process is growing slowly, at roughly 1-2 MB per 10 minutes, which is not quick enough to balloon to over 100 GB RSS over several days; that likely means whatever triggers the issue happens randomly and quite suddenly. I'll continue monitoring the mgr and get back with more observations.
>>
>> /Z
>>
>> On Wed, 22 Nov 2023 at 16:33, Zakhar Kirpichenko <zakhar@xxxxxxxxx> wrote:
>>> Thanks for this. This looks similar to what we're observing, although we don't use the API apart from its use by the Ceph deployment itself - which I guess still counts.
>>>
>>> /Z
>>>
>>> On Wed, 22 Nov 2023, 15:22 Adrien Georget, <adrien.georget@xxxxxxxxxxx> wrote:
>>>> Hi,
>>>>
>>>> This memory leak with ceph-mgr seems to be due to a change in Ceph 16.2.12.
>>>> Check this issue: https://tracker.ceph.com/issues/59580
>>>> We are also affected by this, with or without containerized services.
>>>>
>>>> Cheers,
>>>> Adrien
>>>>
>>>> On 22/11/2023 at 14:14, Eugen Block wrote:
>>>>> One other difference is that you use Docker, right? We use Podman; could it be some Docker restriction?
>>>>>
>>>>> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>>>>>
>>>>>> It's a 6-node cluster with 96 OSDs, not much I/O, mgr . Each node has 384 GB of RAM, each OSD has a memory target of 16 GB, and about 100 GB of memory, give or take, is available (mostly used by page cache) on each node during normal operation. Nothing unusual there, tbh.
>>>>>>
>>>>>> No unusual mgr modules or settings either, except for the disabled progress:
>>>>>>
>>>>>> {
>>>>>>     "always_on_modules": [
>>>>>>         "balancer",
>>>>>>         "crash",
>>>>>>         "devicehealth",
>>>>>>         "orchestrator",
>>>>>>         "pg_autoscaler",
>>>>>>         "progress",
>>>>>>         "rbd_support",
>>>>>>         "status",
>>>>>>         "telemetry",
>>>>>>         "volumes"
>>>>>>     ],
>>>>>>     "enabled_modules": [
>>>>>>         "cephadm",
>>>>>>         "dashboard",
>>>>>>         "iostat",
>>>>>>         "prometheus",
>>>>>>         "restful"
>>>>>>     ],
>>>>>>
>>>>>> /Z
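As an aside on tracking the growth rate: given the slow, steady increase quoted above (roughly 1-2 MB per 10 minutes) versus the sudden jumps past 100 GB, it can help to sample the mgr's resident memory more frequently than the periodic 'ceph orch ps' snapshots. A minimal sketch, assuming the host's ps can see the containerized ceph-mgr process (it normally can); the log path and interval are arbitrary examples:

    # Append a timestamped RSS sample (in KiB) for the largest ceph-mgr process once a minute
    while true; do
        printf '%s ' "$(date '+%F %T')"
        ps -C ceph-mgr -o pid=,rss= --sort=-rss | head -n 1
        sleep 60
    done >> /tmp/ceph-mgr-rss.log

A sudden change of slope in that log at least narrows down the time window to correlate with the mgr log.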
>>>>>> On Wed, 22 Nov 2023, 14:52 Eugen Block, <eblock@xxxxxx> wrote:
>>>>>>> What does your hardware look like memory-wise? Just for comparison, one customer cluster has 4,5 GB in use (middle-sized cluster for OpenStack, 280 OSDs):
>>>>>>>
>>>>>>>   PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
>>>>>>>  6077 ceph      20   0 6357560 4,522g  22316 S 12,00 1,797  57022:54 ceph-mgr
>>>>>>>
>>>>>>> In our own cluster (smaller than that and not really heavily used) the mgr uses almost 2 GB. So those numbers you have seem relatively small.
>>>>>>>
>>>>>>> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>>>>>>>
>>>>>>>> I've disabled the progress module entirely and will see how it goes. Otherwise, mgr memory usage keeps increasing slowly; from past experience it will stabilize at around 1.5-1.6 GB. Other than this event warning, it's unclear what could have caused the random memory ballooning.
>>>>>>>>
>>>>>>>> /Z
>>>>>>>>
>>>>>>>> On Wed, 22 Nov 2023 at 13:07, Eugen Block <eblock@xxxxxx> wrote:
>>>>>>>>> I see these progress messages all the time and I don't think they cause it, but I might be wrong. You can disable it just to rule that out.
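For anyone who wants to reproduce that "rule it out" step: on Pacific the progress module is listed under always_on_modules (see the module list quoted earlier), so it cannot be removed with 'ceph mgr module disable progress'; it is normally silenced with the module's own toggle. A minimal sketch:

    ceph progress clear   # drop stale or stuck progress events
    ceph progress off     # stop the module from tracking new events
    ceph progress on      # re-enable it once the test is over

The current event list can be inspected with 'ceph progress' or 'ceph progress json' before and after clearing.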
>>>>>>>>> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>>>>>>>>>
>>>>>>>>>> Unfortunately, I don't have a full stack trace because there's no crash when the mgr gets oom-killed. There's just the mgr log, which looks completely normal until about 2-3 minutes before the oom-kill, when tcmalloc warnings show up.
>>>>>>>>>>
>>>>>>>>>> I'm not sure that it's the same issue that is described in the tracker. We seem to have some stale "events" in the progress module though:
>>>>>>>>>>
>>>>>>>>>> Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700 0 [progress WARNING root] complete: ev cacc4230-75ee-4892-b8fd-a19fec8f9f66 does not exist
>>>>>>>>>> Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700 0 [progress WARNING root] complete: ev 44824331-3f6b-45c4-b925-423d098c3c76 does not exist
>>>>>>>>>> Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700 0 [progress WARNING root] complete: ev 0139bc54-ae42-4483-b278-851d77f23f9f does not exist
>>>>>>>>>> Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700 0 [progress WARNING root] complete: ev f9d6c20e-b8d8-4625-b9cf-84da1244c822 does not exist
>>>>>>>>>> Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700 0 [progress WARNING root] complete: ev 1486b26d-2a23-4416-a864-2cbb0ecf1429 does not exist
>>>>>>>>>> Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700 0 [progress WARNING root] complete: ev 7f14d01c-498c-413f-b2ef-05521050190a does not exist
>>>>>>>>>> Nov 21 14:57:35 ceph01 bash[3941523]: debug 2023-11-21T14:57:35.950+0000 7f4bb19ef700 0 [progress WARNING root] complete: ev 48cbd97f-82f7-4b80-8086-890fff6e0824 does not exist
>>>>>>>>>>
>>>>>>>>>> I tried clearing them but they keep showing up. I am wondering if these missing events can cause memory leaks over time.
>>>>>>>>>>
>>>>>>>>>> /Z
>>>>>>>>>>
>>>>>>>>>> On Wed, 22 Nov 2023 at 11:12, Eugen Block <eblock@xxxxxx> wrote:
>>>>>>>>>>> Do you have the full stack trace? The pastebin only contains the "tcmalloc: large alloc" messages (same as in the tracker issue). Maybe comment in the tracker issue directly, since Radek asked for someone with a similar problem in a newer release.
>>>>>>>>>>>
>>>>>>>>>>> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>>>>>>>>>>>
>>>>>>>>>>>> Thanks, Eugen. It is similar in the sense that the mgr is getting OOM-killed.
>>>>>>>>>>>>
>>>>>>>>>>>> It started happening in our cluster after the upgrade to 16.2.14. We haven't had this issue with earlier Pacific releases.
>>>>>>>>>>>>
>>>>>>>>>>>> /Z
>>>>>>>>>>>>
>>>>>>>>>>>> On Tue, 21 Nov 2023, 21:53 Eugen Block, <eblock@xxxxxx> wrote:
>>>>>>>>>>>>> Just checking it on the phone, but isn't this quite similar?
>>>>>>>>>>>>>
>>>>>>>>>>>>> https://tracker.ceph.com/issues/45136
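Since both trackers mentioned in this thread are tied to specific releases (the leak reported from 16.2.12 onward, and the ballooning here appearing after the upgrade to 16.2.14), it is worth stating the exact versions when commenting on the tracker. A quick way to confirm what each daemon is actually running, shown only as a sketch:

    ceph versions            # per-component version summary for the whole cluster
    ceph orch ps | grep mgr  # per-daemon memory use, version and image, same listing format as quoted above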
>>>>>>>>>>>>> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Hi,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I'm facing a rather new issue with our Ceph cluster: from time to time ceph-mgr on one of the two mgr nodes gets oom-killed after consuming over 100 GB RAM:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [Nov21 15:02] tp_osd_tp invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
>>>>>>>>>>>>>> [  +0.000010]  oom_kill_process.cold+0xb/0x10
>>>>>>>>>>>>>> [  +0.000002] [ pid ]   uid  tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
>>>>>>>>>>>>>> [  +0.000008] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=504d37b566d9fd442d45904a00584b4f61c93c5d49dc59eb1c948b3d1c096907,mems_allowed=0-1,global_oom,task_memcg=/docker/3826be8f9115479117ddb8b721ca57585b2bdd58a27c7ed7b38e8d83eb795957,task=ceph-mgr,pid=3941610,uid=167
>>>>>>>>>>>>>> [  +0.000697] Out of memory: Killed process 3941610 (ceph-mgr) total-vm:146986656kB, anon-rss:125340436kB, file-rss:0kB, shmem-rss:0kB, UID:167 pgtables:260356kB oom_score_adj:0
>>>>>>>>>>>>>> [  +6.509769] oom_reaper: reaped process 3941610 (ceph-mgr), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The cluster is stable and operating normally, and there's nothing unusual going on before, during or after the kill, so it's unclear what causes the mgr to balloon, use all RAM and get killed. Systemd logs aren't very helpful either: they just show normal mgr operations until it fails to allocate memory and gets killed: https://pastebin.com/MLyw9iVi
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> The mgr has experienced this issue several times in the last 2 months, and the events don't appear to correlate with any other events in the cluster, because basically nothing else happened at around those times. How can I investigate this and figure out what's causing the mgr to consume all memory and get killed?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I would very much appreciate any advice!
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Best regards,
>>>>>>>>>>>>>> Zakhar
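Not an answer to the root cause, but two steps that are commonly taken while an issue like this is being chased, shown here only as a sketch: confirm the OOM-kill history from the kernel log, and hand the active role to the standby mgr before a runaway daemon exhausts the node.

    dmesg -T | grep -iE 'out of memory|oom'  # past OOM kills with readable timestamps
    ceph mgr fail                            # make the standby mgr active; the previously active daemon restarts and releases its memory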
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx