Awesome! Can you submit a PR for discussion and merge? :)
-Greg

On Tue, Apr 10, 2018 at 7:58 AM, xiangyang yu <penglaiyxy@xxxxxxxxx> wrote:
> Hi cephers,
> I have committed a patch that solves the problem, and I have tested it:
> the OSD is pushed out of the cluster when Accepter::entry breaks out.
> I hope it can be merged into the next jewel release, 10.2.11.
> https://github.com/ceph/ceph/commit/dfb4b01a4654aad84ea4388865b1097052e50004
>
> Best regards,
> brandy
>
> 2018-04-04 5:58 GMT+00:00 Gregory Farnum <gfarnum@xxxxxxxxxx>:
>> On Mon, Apr 2, 2018 at 5:52 PM, xiangyang yu <penglaiyxy@xxxxxxxxx> wrote:
>>> Hi Gregory,
>>>
>>> But if there are some other errors (not the fd-limit error),
>>> Accepter::entry will also go away:
>>>
>>>     int sd = ::accept(listen_sd, (sockaddr*)&addr.ss_addr(), &slen);
>>>     if (sd >= 0) {
>>>       int r = set_close_on_exec(sd);
>>>       if (r) {
>>>         ldout(msgr->cct,0) << "accepter set_close_on_exec() failed "
>>>                            << cpp_strerror(r) << dendl;
>>>       }
>>>       errors = 0;
>>>       ldout(msgr->cct,10) << "accepted incoming on sd " << sd << dendl;
>>>
>>>       msgr->add_accept_pipe(sd);
>>>     } else {
>>>       ldout(msgr->cct,0) << "accepter no incoming connection? sd = " << sd
>>>                          << " errno " << errno << " " << cpp_strerror(errno) << dendl;
>>>       if (++errors > 4)
>>>         break;
>>>     }
>>>   }
>>>
>>> In my opinion, it's better to do some work (e.g. shut down the OSD) when
>>> Accepter::entry goes away.
>>
>> Ah, I didn't realize we quit accepting any connections on an error.
>> I'm not really sure what the rationale for that is. Some combination
>> of integrating the Accepter with heartbeating, and simply looping
>> around on errors, seems like a reasonable thing to do. The main issue
>> I see is we don't want to burn up a CPU trying to accept incoming
>> connections when there aren't resources available to do it with, so
>> there should probably be some kind of backoff?
>> -Greg
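As a rough illustration of the capped-backoff idea Greg describes above, here is a minimal sketch. It is not the actual Ceph Accepter code; the function name, delay values, and logging are made up for the example, and a real fix would also tie into heartbeating so the OSD gets marked down if the condition never clears.

#include <sys/socket.h>
#include <algorithm>
#include <atomic>
#include <cerrno>
#include <chrono>
#include <cstdio>
#include <cstring>
#include <thread>

// Keep accepting indefinitely; on errors such as EMFILE ("too many open
// files"), sleep for an increasing, capped interval and retry, so the
// thread neither exits permanently nor spins on the CPU.
void accept_loop(int listen_sd, std::atomic<bool> &stop)
{
  int consecutive_errors = 0;
  while (!stop) {
    sockaddr_storage addr{};
    socklen_t slen = sizeof(addr);
    int sd = ::accept(listen_sd, reinterpret_cast<sockaddr *>(&addr), &slen);
    if (sd >= 0) {
      consecutive_errors = 0;        // a success clears the error streak
      // ... hand the new socket off to a dispatcher here ...
      continue;
    }
    std::fprintf(stderr, "accept failed: errno %d (%s)\n",
                 errno, std::strerror(errno));
    // Back off: 200 ms, 400 ms, ... capped at 30 s, instead of breaking
    // out of the loop after a handful of errors.
    ++consecutive_errors;
    int delay_ms = std::min(100 * (1 << std::min(consecutive_errors, 8)), 30000);
    std::this_thread::sleep_for(std::chrono::milliseconds(delay_ms));
  }
}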
>>> Best wishes,
>>> brandy
>>>
>>> 2018-04-02 17:56 GMT+00:00 Gregory Farnum <gfarnum@xxxxxxxxxx>:
>>>> On Fri, Mar 30, 2018 at 11:47 PM, xiangyang yu <penglaiyxy@xxxxxxxxx> wrote:
>>>>> Hi cephers,
>>>>>
>>>>> Recently we had a big problem in our production ceph cluster. It has
>>>>> been running very well for one and a half years.
>>>>>
>>>>> The RBD client network and the ceph public network are different,
>>>>> communicating through a router.
>>>>>
>>>>> Our ceph version is 0.94.5 and our IO transport uses SimpleMessenger.
>>>>>
>>>>> Yesterday some of our VMs (using qemu librbd) could not send IO to the
>>>>> ceph cluster.
>>>>>
>>>>> Ceph status is healthy, with no OSDs going up/down and no PGs inactive
>>>>> or down.
>>>>>
>>>>> When we export an rbd image through rbd export, we find the rbd client
>>>>> cannot connect to one OSD, osd.34.
>>>>>
>>>>> We find that osd.34 is up and running, but in its log we find errors
>>>>> like this, repeated many times:
>>>>>
>>>>> accepter no incoming connection? sd = -1, errno 24, too many open files
>>>>> accepter no incoming connection? sd = -1, errno 24, too many open files
>>>>> accepter no incoming connection? sd = -1, errno 24, too many open files
>>>>>
>>>>> Our max open files is set to 200000, but the filestore fd cache size is
>>>>> even bigger, 500000. I think we have some wrong fd configuration. But
>>>>> when there are errors in Accepter::entry(), it's better to assert the
>>>>> OSD process, so that new rbd clients can connect to the ceph cluster
>>>>> and, when there is a network problem, old rbd clients can also
>>>>> reconnect to the cluster.
>>>>
>>>> If we asserted here, the OSD would just go into an assert loop as it
>>>> rebooted, all the clients reconnected, and then they ran into its fd
>>>> limit again.
>>>>
>>>> Unfortunately there isn't much we can do about it. This is a
>>>> fundamental thing with Linux fd limits and networked services; you
>>>> just need to tune it correctly. :(
>>>>
>>>> It does become less of a problem in later versions with BlueStore
>>>> (which doesn't use fds) and AsyncMessenger (which uses just as many
>>>> sockets, but fewer threads).
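The fd-limit mismatch above (max open files 200000 against a filestore fd cache of 500000) is the kind of thing a daemon could flag at startup. The sketch below is illustrative only: fd_budget_sane and the headroom constant are invented for the example and are not part of Ceph; it simply compares a configured fd budget against RLIMIT_NOFILE.

#include <sys/resource.h>
#include <cstdio>

// Warn when the configured fd cache budget leaves no headroom for
// client sockets under the process's open-file limit.
bool fd_budget_sane(unsigned long fd_cache_size)
{
  struct rlimit rl;
  if (getrlimit(RLIMIT_NOFILE, &rl) != 0) {
    std::perror("getrlimit");
    return false;
  }
  const unsigned long headroom = 4096;   // sockets, log files, etc.
  if (fd_cache_size + headroom > static_cast<unsigned long>(rl.rlim_cur)) {
    std::fprintf(stderr,
                 "fd cache size %lu + headroom %lu exceeds RLIMIT_NOFILE %lu; "
                 "raise the limit or shrink the cache\n",
                 fd_cache_size, headroom,
                 static_cast<unsigned long>(rl.rlim_cur));
    return false;
  }
  return true;
}

int main()
{
  // With the values from the report above (limit 200000, cache 500000)
  // this returns false and prints a warning.
  return fd_budget_sane(500000) ? 0 : 1;
}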