Re: avoid 3-mds fs laggy on 1 rejoin?

John Spray <jspray@xxxxxxxxxx> · Tue, 6 Oct 2015 13:01:20 +0100

On Tue, Oct 6, 2015 at 12:07 PM, Dzianis Kahanovich
<mahatma@xxxxxxxxxxxxxx> wrote:
> John Spray wrote:
>>
>> On Tue, Oct 6, 2015 at 11:43 AM, Dzianis Kahanovich
>> <mahatma@xxxxxxxxxxxxxx> wrote:
>>>
>>> Short: how can I reliably avoid (if possible) fs freezes when 1 of 3
>>> MDSs rejoins?
>>>
>>> ceph version 0.94.3-242-g79385a8
>>> (79385a85beea9bccd82c99b6bda653f0224c4fcd)
>>>
>>> I am moving 2 VM clients from ocfs2 (which started to deadlock the VM on
>>> snapshot) to cephfs (at least I can back it up). Maybe I just didn't
>>> notice it before, or maybe there is a cephfs pressure problem, but while
>>> 1 of 3 MDSs is in rejoin (slow!), the whole MDS cluster is stuck (the
>>> good news: all clients are alive afterwards). How can I make the MDS
>>> cluster reliable across at least 1 restart?
>>
>>
>> It's not exactly clear to me how you've got this set up.  What's the
>> output of "ceph status"?
>
>
>     cluster 4fc73849-f913-4689-b6a6-efcefccae8d1
>      health HEALTH_OK
>      monmap e1: 3 mons at
> {a=10.227.227.101:6789/0,b=10.227.227.103:6789/0,c=10.227.227.104:6789/0}
>             election epoch 28556, quorum 0,1,2 a,b,c
>      mdsmap e7136: 1/1/1 up {0=c=up:active}, 1 up:standby-replay, 1
> up:standby
>      osdmap e158986: 15 osds: 15 up, 15 in
>       pgmap v60013179: 6032 pgs, 8 pools, 6528 GB data, 2827 kobjects
>             16257 GB used, 6005 GB / 22263 GB avail
>                 6032 active+clean
>   client io 3211 kB/s rd, 1969 kB/s wr, 176 op/s

OK, thanks.  So the symptom is that when you have an MDS failure, the
standby-replay guy is coming up, but he is spending too long in
'rejoin' state, right?  How long, exactly?

You could try setting a higher debug level (e.g. debug mds = 10) on
your MDS before it takes over, so that the log output can give us an
idea of what the daemon is doing while it's stuck in rejoin.
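
A minimal sketch of that setting, assuming the [mds] section from the
config quoted below; treat it as a starting point rather than a
recommended configuration. In ceph.conf it is read when the MDS starts:

    [mds]
             # assumption: applies to all MDS daemons (a, b, c);
             # level 10 is verbose, revert after collecting the log
             debug mds = 10

It should also be possible to change it on a running daemon through
its admin socket, e.g. "ceph daemon mds.a config set debug_mds 10" on
the host running that daemon (mds.a is only an example name). With
default settings the output goes to /var/log/ceph/ceph-mds.<id>.log.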

John

>
> PS: I know, too many PGs; "mon pg warn max per osd = 1400"...
>
>
>>
>> John
>>
>>>
>>> My current mds config:
>>>
>>> [mds]
>>>          mds recall state timeout = 120
>>>          mds bal mode = 1
>>>          mds standby replay = true
>>>          mds cache size = 500000
>>>          mds mem max = 2097152
>>>          mds op history size = 50
>>>          # vs. laggy beacon
>>>          mds decay halflife = 9
>>>          mds beacon interval = 8
>>>          mds beacon grace = 30
>>>
>>> [mds.a]
>>>          host = megaserver1
>>> [mds.b]
>>>          host = megaserver3
>>> [mds.c]
>>>          host = megaserver4
>>>
>>> (I tried reverting all non-default settings; IMHO it made no difference.
>>> Correct me if I'm wrong.)
>>> Or maybe I need to take special care when stopping the MDS (currently I
>>> use SIGKILL).
>>>
>>> --
>>> WBR, Dzianis Kahanovich AKA Denis Kaganovich,
>>> http://mahatma.bspu.unibel.by/
>>> _______________________________________________
>>> ceph-users mailing list
>>> ceph-users@xxxxxxxxxxxxxx
>>> http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com