On Fri, Oct 26, 2018 at 3:53 PM Johannes Schlueter
<bleaktradition@xxxxxxxxx> wrote:
>
> Hello,
> thanks for the reply.
> Before the restart the cluster was HEALTH_OK, with a "slow request"
> warning for a few moments.
>
> Maybe helpful:
> #
> Events by type:
> COMMITED: 12188
> EXPORT: 196
> IMPORTFINISH: 197
> IMPORTSTART: 197
> OPEN: 28096
> SESSION: 2
> SESSIONS: 64
> SLAVEUPDATE: 8440
> SUBTREEMAP: 256
> UPDATE: 124222
> Errors: 0

How about rank 1? (cephfs-journal-tool --rank 1 event get summary)

> Yan, Zheng <ukernel@xxxxxxxxx> wrote on Fri, 26 Oct 2018, 09:13:
>>
>> On Fri, Oct 26, 2018 at 2:41 AM Johannes Schlueter
>> <bleaktradition@xxxxxxxxx> wrote:
>> >
>> > Hello,
>> >
>> > OS: Ubuntu Bionic LTS
>> > Ceph v12.2.7 Luminous (on one node we updated to ceph-mds 12.2.8,
>> > with no luck)
>> > 2 active MDS and 1 standby MDS
>> >
>> > We just experienced a problem while restarting an MDS. As it began
>> > to replay the journal, the node ran out of memory.
>> > One restart later, after giving it about 175 GB of swap, it still
>> > breaks down.
>> >
>> > As mentioned in an earlier mailing list entry about a similar
>> > problem, we restarted all MDS nodes, causing all of them to leak.
>> > Now they just switch around: as each one breaks down, the standby
>> > starts the replay.
>> >
>>
>> Did you see the warning "Behind on trimming" before the MDS restart?
>>
>> > Sincerely
>> >
>> > Patrick
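
For reference, a rough sketch of how the diagnostics discussed above can
be gathered. The daemon name mds.a is a placeholder, and the perf dump /
config get checks are an assumption on my part, not something suggested
in the thread itself:

    # Read-only summary of journal events for each active rank
    cephfs-journal-tool --rank 0 event get summary
    cephfs-journal-tool --rank 1 event get summary

    # How far the MDS log has grown versus its trimming target
    # (assumes the "mds_log" perf counter section is present)
    ceph daemon mds.a perf dump mds_log
    ceph daemon mds.a config get mds_log_max_segments

    # "Behind on trimming" shows up here while the log is backlogged
    ceph health detail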