Re: crc error when decode_message?

On Tue, Mar 17, 2015 at 6:46 AM, Sage Weil <sage@xxxxxxxxxxxx> wrote:
> On Tue, 17 Mar 2015, Ning Yao wrote:
>> 2015-03-16 22:06 GMT+08:00 Haomai Wang <haomaiwang@xxxxxxxxx>:
>> > On Mon, Mar 16, 2015 at 10:04 PM, Xinze Chi <xmdxcxz@xxxxxxxxx> wrote:
>> >> How does the primary process the write request?
>> >>
>> >> Thanks.
>> >>
>> >> 2015-03-16 22:01 GMT+08:00 Haomai Wang <haomaiwang@xxxxxxxxx>:
>> >>> AFAIR, both Pipe and AsyncConnection will mark themselves faulted and
>> >>> shut down the socket, and the peer will detect the reset. So each side
>> >>> has a chance to rebuild the session.
>> >>>
>> >>> On Mon, Mar 16, 2015 at 9:19 PM, Xinze Chi <xmdxcxz@xxxxxxxxx> wrote:
>> >>>> For example: a client sends a write request to osd.0 (the primary),
>> >>>> and osd.0 sends MOSDSubOp to osd.1 and osd.2.
>> >>>>
>> >>>> osd.1 sends its reply to osd.0 (the primary), but something goes wrong:
>> >>>>
>> >>>> 1. decode_message hits a crc error while decoding the reply msg,
>> >>>> or
>> >>>> 2. the reply msg is lost on its way to osd.0, so osd.0 never receives it.
>> >>>>
>> >>>> Could anyone tell me what the behavior of osd.0 (the primary) is?
>> >>>>
>> >
>> > osd.0 and osd.1 will both try to reconnect to the peer side, and the
>> > lost message will be resent from osd.1 to osd.0.
>> So I wonder: if a different routing path delays the arrival of one
>> message, in_seq will already have been advanced, and then, based on
>> that logic, the delayed message will be dropped and discarded when it
>> finally arrives. If that message is just a sub_op reply, as Xinze
>> describes, how does Ceph proceed after that? It seems the repop for
>> the write op would wait forever until the OSD restarts?
>
> These sorts of scenarios are why src/msg/simple/Pipe.cc (and in particular,
> accept()) is not so simple.  The case you describe is handled around
>
>  https://github.com/ceph/ceph/blob/master/src/msg/simple/Pipe.cc#L492
>
> In other words, this is all masked by the Messenger layer so that the
> higher layers (OSD.cc etc) see a single, ordered, reliable stream of
> messages and all of the failure/retry/reconnect logic is hidden.
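
As a rough illustration of what Sage describes, here is a minimal,
self-contained sketch of the kind of sequence-number bookkeeping involved:
each side tracks the last seq it delivered (in_seq) and keeps
sent-but-unacked messages around so they can be resent after a reconnect,
while anything at or below the peer's in_seq is dropped as a stale
duplicate. This is not the actual Pipe.cc/AsyncConnection code; the class
and function names (Connection, handle_incoming, reconnect) are
illustrative only.

// Conceptual sketch only -- not the actual Pipe.cc/AsyncConnection code.
// Each connection tracks the last sequence number it delivered (in_seq)
// and keeps sent-but-unacked messages so they can be resent after a
// reconnect; anything at or below the peer's in_seq is dropped as stale.
#include <cstdint>
#include <deque>
#include <iostream>
#include <string>
#include <utility>

struct Msg {
  uint64_t seq;
  std::string payload;
};

class Connection {
 public:
  // Sender side: assign the next seq and keep the message until it is acked.
  void send(std::string payload) {
    Msg m{++out_seq_, std::move(payload)};
    sent_unacked_.push_back(m);
    write_to_socket(m);  // may be lost, or fail its CRC check on the far side
  }

  // Receiver side: deliver in order, drop anything we have already seen.
  void handle_incoming(const Msg& m) {
    if (m.seq <= in_seq_) {
      std::cout << "drop stale/duplicate msg seq=" << m.seq << "\n";
      return;
    }
    in_seq_ = m.seq;
    std::cout << "deliver msg seq=" << m.seq << " (" << m.payload << ")\n";
  }

  // Peer acknowledged everything up to 'seq'; those can be forgotten.
  void handle_ack(uint64_t seq) {
    while (!sent_unacked_.empty() && sent_unacked_.front().seq <= seq)
      sent_unacked_.pop_front();
  }

  // After a fault, the peer tells us its in_seq during reconnect; everything
  // newer is resent, so the OSD layer above never sees a gap.
  void reconnect(uint64_t peer_in_seq) {
    handle_ack(peer_in_seq);
    for (const Msg& m : sent_unacked_)
      write_to_socket(m);
  }

  uint64_t in_seq() const { return in_seq_; }

 private:
  void write_to_socket(const Msg&) { /* socket I/O elided */ }

  uint64_t out_seq_ = 0;          // last seq handed out by send()
  uint64_t in_seq_ = 0;           // last seq delivered upward
  std::deque<Msg> sent_unacked_;  // resend candidates after a fault
};

int main() {
  Connection osd1;  // osd.1's side of the link to osd.0
  Connection osd0;  // osd.0's side

  osd1.send("sub_op_reply");      // write is lost / rejected on the far side
  osd1.reconnect(osd0.in_seq());  // both sides fault and rebuild the session

  osd0.handle_incoming({1, "sub_op_reply"});  // resent copy is delivered
  osd0.handle_incoming({1, "sub_op_reply"});  // a late duplicate is dropped
}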

Just to be clear, that's the originally described case of reconnecting.
The different-routing-paths stuff is all handled by TCP underneath us,
which is one of the reasons we use it. ;)
-Greg
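
To tie this back to the subject line: a CRC mismatch found while decoding a
message is treated like any other connection fault, so it feeds into the
same reconnect/resend path. Below is a rough, self-contained sketch of that
idea; it is not the real decode_message()/footer code, and the names
(WireMessage, decode_and_verify) plus the plain CRC-32 here (Ceph actually
uses CRC-32C) are illustrative assumptions.

// Conceptual sketch only -- not the real decode_message() code.
// The receiver recomputes a CRC over the payload it read off the wire and
// compares it with the CRC the sender placed in the message footer; any
// mismatch is treated like a socket error: fault the connection and let the
// reconnect/resend logic sketched earlier repair the session.
#include <cstdint>
#include <iostream>
#include <string>

// Minimal bitwise CRC-32 (illustrative; Ceph actually uses CRC-32C).
static uint32_t crc32(const std::string& data) {
  uint32_t crc = 0xFFFFFFFFu;
  for (unsigned char c : data) {
    crc ^= c;
    for (int i = 0; i < 8; ++i)
      crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
  }
  return ~crc;
}

struct WireMessage {
  std::string payload;
  uint32_t footer_crc = 0;  // filled in by the sender before writing
};

enum class DecodeResult { ok, crc_mismatch };

// Receiver side: if the recomputed CRC does not match, do not hand the
// message to the OSD at all -- fault the connection instead.
DecodeResult decode_and_verify(const WireMessage& m) {
  return crc32(m.payload) == m.footer_crc ? DecodeResult::ok
                                          : DecodeResult::crc_mismatch;
}

int main() {
  WireMessage reply{"sub_op_reply from osd.1", 0};
  reply.footer_crc = crc32(reply.payload);

  reply.payload[3] ^= 0x40;  // simulate corruption on the wire

  if (decode_and_verify(reply) == DecodeResult::crc_mismatch) {
    std::cout << "crc mismatch: fault connection, drop socket, reconnect\n";
    // ...at which point the sequence-number bookkeeping sketched earlier
    // resends the reply, so osd.0's repop is not stuck forever.
  }
}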