Fwd: Client still connect failed leader after that mon down

Zhi Zhang <zhang.david2011@xxxxxxxxx> · Mon, 21 Dec 2015 17:49:03 +0800

Regards,
Zhi Zhang (David)
Contact: zhang.david2011@xxxxxxxxx
              zhangz.david@xxxxxxxxxxx

---------- Forwarded message ----------
From: Jaze Lee <jazeltq@xxxxxxxxx>
Date: Mon, Dec 21, 2015 at 4:08 PM
Subject: Re: Client still connect failed leader after that mon down
To: Zhi Zhang <zhang.david2011@xxxxxxxxx>

Hello,
I am terribly sorry.
I think we may not need to rework MonClient.{h,cc} after all. We found
that the parameter mon_client_hunt_interval is very useful.
When we set mon_client_hunt_interval = 0.5, a ceph command completes
quickly even if it first connects to the down leader mon.
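
As a rough illustration of why a smaller hunt interval helps, here is a
minimal sketch in Python. It assumes a simplified model that is not taken
from the MonClient code: the client picks one monitor uniformly at random,
and if that monitor is down it waits one hunt interval and then reaches a
live monitor on the next try. The function name is hypothetical.

```python
def expected_startup_delay(num_mons, num_down, hunt_interval):
    """Expected extra delay (seconds) before the client reaches a live mon.

    Simplified, assumed model: the first pick is uniformly random, and a
    failed first pick costs exactly one hunt interval before the next
    attempt succeeds.
    """
    if num_mons <= 0 or not 0 <= num_down < num_mons:
        raise ValueError("need at least one live monitor")
    p_first_pick_down = num_down / num_mons
    return p_first_pick_down * hunt_interval


if __name__ == "__main__":
    # Three monitors, one down, as in our test.
    for interval in (3.0, 0.5):
        delay = expected_startup_delay(3, 1, interval)
        print(f"hunt interval {interval}s -> expected delay {delay:.2f}s")
```

Under this model, three monitors with one down give an average delay of
about 1 s with the default of 3.0 and about 0.17 s with 0.5.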

I asked the question in the first place because we took the parameter
name from the official site:
http://docs.ceph.com/docs/master/rados/configuration/mon-config-ref/
It is written there as:

mon client hung interval

Description: The client will try a new monitor every N seconds until it
establishes a connection.
Type: Double
Default: 3.0

We set it, but it did not work.

I think it may be a typo.
The correct configuration parameter should be mon client hunt interval.
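
To catch this kind of mistake in an existing config file, something like
the following Python sketch could be used. It is only an illustration: the
helper names and the sample config text are hypothetical, and it assumes
that spaces and underscores in option names are treated as equivalent and
that a misspelled option is not reported as an error, which matches what
we observed.

```python
MISSPELLED = "mon_client_hung_interval"   # spelling shown on the doc page
CORRECT = "mon_client_hunt_interval"      # spelling that worked for us


def normalize_key(key):
    """Lowercase the key and join whitespace-separated words with '_'."""
    return "_".join(key.strip().lower().split())


def find_misspelled_options(conf_text):
    """Return (line_number, line) pairs that set the misspelled option."""
    hits = []
    for lineno, raw in enumerate(conf_text.splitlines(), start=1):
        # Drop trailing comments introduced by '#' or ';'.
        line = raw.split("#", 1)[0].split(";", 1)[0]
        if "=" not in line:
            continue
        key = line.split("=", 1)[0]
        if normalize_key(key) == MISSPELLED:
            hits.append((lineno, raw.strip()))
    return hits


if __name__ == "__main__":
    # Hypothetical config text, not a real ceph.conf.
    sample = """\
[global]
mon client hung interval = 0.5
mon_client_hunt_interval = 0.5
"""
    for lineno, line in find_misspelled_options(sample):
        print(f"line {lineno}: '{line}' should use {CORRECT}")
```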

Could someone please help fix this on the official site?

Thanks a lot.

2015-12-21 14:00 GMT+08:00 Jaze Lee <jazeltq@xxxxxxxxx>:
> Right now we use simple msg, and the ceph version is 0.80...
>
> 2015-12-21 10:55 GMT+08:00 Zhi Zhang <zhang.david2011@xxxxxxxxx>:
>> Which msg type and ceph version are you using?
>>
>> When we used 0.94.1 with async msg, we encountered a similar issue.
>> The client was trying to connect to a down monitor when it had just
>> started, and this connection would hang there. This is because the
>> earlier async msg used blocking connection mode.
>>
>> After we backported the non-blocking mode of async msg from a later ceph
>> version, we have not encountered this issue again.
>>
>>
>> Regards,
>> Zhi Zhang (David)
>> Contact: zhang.david2011@xxxxxxxxx
>>               zhangz.david@xxxxxxxxxxx
>>
>>
>> On Fri, Dec 18, 2015 at 11:41 AM, Jevon Qiao <scaleqiao@xxxxxxxxx> wrote:
>>> On 17/12/15 21:27, Sage Weil wrote:
>>>>
>>>> On Thu, 17 Dec 2015, Jaze Lee wrote:
>>>>>
>>>>> Hello cephers:
>>>>>      In our test, there are three monitors. We find that a client
>>>>> running a ceph command is slow when the leader mon is down. Even a
>>>>> long time afterwards, the first ceph command a client runs is still slow.
>>>>> From strace, we find that the client first tries to connect to the
>>>>> leader, then after 3s it connects to the second mon.
>>>>> After some searching we find that the quorum has not changed; the
>>>>> leader is still the down monitor.
>>>>> Is that normal?  Or is there something I am missing?
>>>>
>>>> It's normal.  Even when the quorum does change, the client doesn't
>>>> know that.  It should be contacting a random mon on startup, though, so I
>>>> would expect the 3s delay 1/3 of the time.
>>>
>>> That's because the client randomly picks a mon from the monmap. But what
>>> we observed is that when a mon is down, no change is made to the monmap
>>> (neither the epoch nor the members). Is that the culprit for this phenomenon?
>>>
>>> Thanks,
>>> Jevon
>>>
>>>> A long-standing low-priority feature request is to have the client contact
>>>> 2 mons in parallel so that it can still connect quickly if one is down.
>>>> It requires some non-trivial work in mon/MonClient.{cc,h} though and I
>>>> don't think anyone has looked at it seriously.
>>>>
>>>> sage
>>>>
>>>> --
>>>> To unsubscribe from this list: send the line "unsubscribe ceph-devel" in
>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>>
>>>
>
>
>
> --
> A humble gentleman

--
A humble gentleman