Re: ceph_leadership_team_meeting_s18e06.mkv

"David Orman" <ormandj@xxxxxxxxxxxx> · Fri, 08 Sep 2023 10:23:53 -0500

I would suggest updating: https://tracker.ceph.com/issues/59580

We also noticed it on 16.2.13 after upgrading from 16.2.10, so the change was likely introduced somewhere between those two releases.

David

On Fri, Sep 8, 2023, at 04:00, Loïc Tortay wrote:
> On 07/09/2023 21:33, Mark Nelson wrote:
>> Hi Rok,
>> 
>> We're still trying to catch what's causing the memory growth, so it's hard 
>> to guess at which releases are affected.  We know it's happening 
>> intermittently on a live Pacific cluster at least.  If you have the 
>> ability to catch it while it's happening, there are several 
>> approaches/tools that might aid in diagnosing it. Debugging tools are a 
>> bit tougher to get working in container deployments, though, which afaik 
>> has slowed down existing attempts at diagnosing the issue.
>> 
> Hello,
> We have a cluster recently upgraded from Octopus to Pacific 16.2.13 
> where the active MGR was OOM-killed a few times.
>
> We have another cluster that was recently upgraded from 16.2.11 to 
> 16.2.14 and the issue also started to appear (very soon) on that cluster.
> We didn't have the issue before, during the months running 16.2.11.
>
> In short: the issue seems to be due to a change in 16.2.12 or 16.2.13.
>
>
> Loïc.
> -- 
> |       Loïc Tortay <tortay@xxxxxxxxxxx> - IN2P3 Computing Centre      |
> _______________________________________________
> ceph-users mailing list -- ceph-users@xxxxxxx
> To unsubscribe send an email to ceph-users-leave@xxxxxxx