On 07/09/2023 21:33, Mark Nelson wrote:
> Hi Rok,
> We're still trying to catch what's causing the memory growth, so it's hard
> to guess which releases are affected. We know it's happening
> intermittently on a live Pacific cluster at least. If you have the
> ability to catch it while it's happening, there are several
> approaches/tools that might aid in diagnosing it. It is a bit tougher to
> get debugging tools working in container deployments, though, which AFAIK
> has slowed down existing attempts at diagnosing the issue.
Hello,
We have a cluster recently upgraded from Octopus to Pacific 16.2.13
where the active MGR was OOM-killed a few times.
We have another cluster that was recently upgraded from 16.2.11 to
16.2.14 and the issue also started to appear (very soon) on that cluster.
We didn't have the issue before, during the months running 16.2.11.
In short: the issue seems to be due to a change in 16.2.12 or 16.2.13.
Loïc.
--
| Loïc Tortay <tortay@xxxxxxxxxxx> - IN2P3 Computing Centre |
_______________________________________________
ceph-users mailing list -- ceph-users@xxxxxxx