Re: Ceph 16.2.14: ceph-mgr getting oom-killed

Hi,

This ceph-mgr memory leak seems to be due to a change introduced in Ceph 16.2.12.
See this issue: https://tracker.ceph.com/issues/59580
We are also affected by it, with or without containerized services.
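Until a fix lands, a simple stopgap (assuming you run a standby mgr) is to fail the active mgr over before it grows too large, on whatever schedule suits you:

  # see which mgr is active and how much memory the mgr daemons use
  ceph mgr stat
  ceph orch ps --daemon-type mgr   # cephadm/rook only; otherwise check top/ps on the node
  # hand the active role to the standby; the bloated daemon comes back as standby
  ceph mgr fail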

Cheers,
Adrien

On 22/11/2023 at 14:14, Eugen Block wrote:
One other difference is that you use Docker, right? We use Podman; could it be some Docker restriction?

Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:

It's a 6-node cluster with 96 OSDs and not much I/O. Each node has 384 GB of RAM, each OSD has a memory target of 16 GB, and about 100 GB of memory, give or take, is available (mostly used by page cache) on each node during normal operation. Nothing unusual there, tbh.
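(For the record, those numbers add up: 96 OSDs across 6 nodes is 16 OSDs per node, and 16 x 16 GB of osd_memory_target is 256 GB, leaving roughly 128 GB of each 384 GB node for the OS, page cache and other daemons, which is consistent with the ~100 GB observed.)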

No unusual mgr modules or settings either, except for the disabled progress module:

{
    "always_on_modules": [
        "balancer",
        "crash",
        "devicehealth",
        "orchestrator",
        "pg_autoscaler",
        "progress",
        "rbd_support",
        "status",
        "telemetry",
        "volumes"
    ],
    "enabled_modules": [
        "cephadm",
        "dashboard",
        "iostat",
        "prometheus",
        "restful"
    ],

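For reference, progress is an always-on module, so it can't be removed with "ceph mgr module disable"; turning it off and flushing its events goes through the module's own commands (recent Pacific), roughly:

  ceph progress off     # stop tracking new events
  ceph progress clear   # drop stale/in-flight events (may not exist on older releases)
  ceph progress on      # re-enable later if desired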
/Z

On Wed, 22 Nov 2023, 14:52 Eugen Block, <eblock@xxxxxx> wrote:

What does your hardware look like memory-wise? Just for comparison,
the mgr on one customer cluster has 4.5 GB in use (a middle-sized cluster
for OpenStack, 280 OSDs):

     PID USER      PR  NI    VIRT    RES    SHR S  %CPU %MEM     TIME+ COMMAND
    6077 ceph      20   0 6357560 4,522g  22316 S 12,00 1,797 57022:54 ceph-mgr

In our own cluster (smaller than that and not really heavily used) the
mgr uses almost 2 GB. So those numbers you have seem relatively small.
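If you want the same numbers without top's locale formatting, these show the equivalent (ceph orch ps needs cephadm; plain ps works anywhere):

  # per-daemon memory as the orchestrator sees it (MEM USE / MEM LIM columns)
  ceph orch ps --daemon-type mgr
  # RSS/VSZ straight from the mgr host, in kB
  ps -o pid,rss,vsz,cmd -C ceph-mgr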

Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:

> I've disabled the progress module entirely and will see how it goes.
> Otherwise, mgr memory usage keeps increasing slowly; from past experience
> it will stabilize at around 1.5-1.6 GB. Other than this event warning, it's
> unclear what could have caused the random memory ballooning.
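> If it helps, a trivial loop on the mgr host is enough to catch the ramp-up
> (just an example, adjust the interval to taste):
>
>   while sleep 60; do date; ps -o pid,rss,cmd -C ceph-mgr; done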
>
> /Z
>
> On Wed, 22 Nov 2023 at 13:07, Eugen Block <eblock@xxxxxx> wrote:
>
>> I see these progress messages all the time; I don't think they cause
>> it, but I might be wrong. You can disable it just to rule that out.
>>
>> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>>
>> > Unfortunately, I don't have a full stack trace because there's no crash
>> > when the mgr gets oom-killed. There's just the mgr log, which looks
>> > completely normal until about 2-3 minutes before the oom-kill, when
>> > tcmalloc warnings show up.
>> >
>> > I'm not sure that it's the same issue that is described in the tracker.
>> > We seem to have some stale "events" in the progress module though:
>> >
>> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev cacc4230-75ee-4892-b8fd-a19fec8f9f66 does not exist
>> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev 44824331-3f6b-45c4-b925-423d098c3c76 does not exist
>> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev 0139bc54-ae42-4483-b278-851d77f23f9f does not exist
>> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev f9d6c20e-b8d8-4625-b9cf-84da1244c822 does not exist
>> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev 1486b26d-2a23-4416-a864-2cbb0ecf1429 does not exist
>> > Nov 21 14:56:30 ceph01 bash[3941523]: debug 2023-11-21T14:56:30.718+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev 7f14d01c-498c-413f-b2ef-05521050190a does not exist
>> > Nov 21 14:57:35 ceph01 bash[3941523]: debug 2023-11-21T14:57:35.950+0000 7f4bb19ef700  0 [progress WARNING root] complete: ev 48cbd97f-82f7-4b80-8086-890fff6e0824 does not exist
>> >
>> > I tried clearing them but they keep showing up. I am wondering if these
>> > missing events can cause memory leaks over time.
>> >
>> > /Z
>> >
>> > On Wed, 22 Nov 2023 at 11:12, Eugen Block <eblock@xxxxxx> wrote:
>> >
>> >> Do you have the full stack trace? The pastebin only contains the
>> >> "tcmalloc: large alloc" messages (same as in the tracker issue).
Maybe
>> >> comment in the tracker issue directly since Radek asked for someone
>> >> with a similar problem in a newer release.
>> >>
>> >> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>> >>
>> >> > Thanks, Eugen. It is similar in the sense that the mgr is getting
>> >> > OOM-killed.
>> >> >
>> >> > It started happening in our cluster after the upgrade to 16.2.14. We
>> >> > haven't had this issue with earlier Pacific releases.
>> >> >
>> >> > /Z
>> >> >
>> >> > On Tue, 21 Nov 2023, 21:53 Eugen Block, <eblock@xxxxxx> wrote:
>> >> >
>> >> >> Just checking it on the phone, but isn’t this quite similar?
>> >> >>
>> >> >> https://tracker.ceph.com/issues/45136
>> >> >>
>> >> >> Quoting Zakhar Kirpichenko <zakhar@xxxxxxxxx>:
>> >> >>
>> >> >> > Hi,
>> >> >> >
>> >> >> > I'm facing a rather new issue with our Ceph cluster: from time to time
>> >> >> > ceph-mgr on one of the two mgr nodes gets oom-killed after consuming
>> >> >> > over 100 GB RAM:
>> >> >> >
>> >> >> > [Nov21 15:02] tp_osd_tp invoked oom-killer: gfp_mask=0x100cca(GFP_HIGHUSER_MOVABLE), order=0, oom_score_adj=0
>> >> >> > [  +0.000010] oom_kill_process.cold+0xb/0x10
>> >> >> > [  +0.000002] [  pid  ]   uid tgid total_vm      rss pgtables_bytes swapents oom_score_adj name
>> >> >> > [  +0.000008] oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=504d37b566d9fd442d45904a00584b4f61c93c5d49dc59eb1c948b3d1c096907,mems_allowed=0-1,global_oom,task_memcg=/docker/3826be8f9115479117ddb8b721ca57585b2bdd58a27c7ed7b38e8d83eb795957,task=ceph-mgr,pid=3941610,uid=167
>> >> >> > [  +0.000697] Out of memory: Killed process 3941610 (ceph-mgr) total-vm:146986656kB, anon-rss:125340436kB, file-rss:0kB, shmem-rss:0kB, UID:167 pgtables:260356kB oom_score_adj:0
>> >> >> > [  +6.509769] oom_reaper: reaped process 3941610 (ceph-mgr), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB
>> >> >> >
>> >> >> > The cluster is stable and operating normally, there's nothing unusual
>> >> >> > going on before, during or after the kill, thus it's unclear what causes
>> >> >> > the mgr to balloon, use all RAM and get killed. Systemd logs aren't very
>> >> >> > helpful: they just show normal mgr operations until it fails to allocate
>> >> >> > memory and gets killed: https://pastebin.com/MLyw9iVi
>> >> >> >
>> >> >> > The mgr experienced this issue several times in the last 2 months, and
>> >> >> > the events don't appear to correlate with any other events in the cluster
>> >> >> > because basically nothing else happened at around those times. How can I
>> >> >> > investigate this and figure out what's causing the mgr to consume all
>> >> >> > memory and get killed?
>> >> >> >
>> >> >> > I would very much appreciate any advice!
>> >> >> >
>> >> >> > Best regards,
>> >> >> > Zakhar

_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx
To unsubscribe send an email to ceph-users-leave@xxxxxxx