Re: [PATCH] Allow cpg to send large messages

On 23/02/15 23:48, Andrew Beekhof wrote:
> 
>> On 24 Feb 2015, at 12:08 am, Christine Caulfield <ccaulfie@xxxxxxxxxx> wrote:
>>
>> On 16/02/15 14:10, Christine Caulfield wrote:
>>> On 13/02/15 12:27, Jan Friesse wrote:
>>>> Chrissie,
>>>>
>>>> Christine Caulfield wrote:
>>>>> It occurs to me that, as this has the potential to break Virtual
>>>>> Synchrony, there should be an option to either disable message
>>>>
>>>> It took me a while to work out what exactly you mean by breaking EVS.
>>>> This is something we (or at least I) totally forgot about, but it looks
>>>> like a HUGE problem and I don't even think that pcmk is able to handle
>>>> this situation (I believe you are talking about the situation where one
>>>> node is sending a long message, and another node leaves and then rejoins
>>>> the membership while the long message is being sent, so it will not
>>>> receive that message).
>>>>
>>>>> fragmentation or to give some indication of the maximum message size that
>>>>> will not be fragmented.
>>>>>
>>>>> Thoughts?
>>>>
>>>> I'm thinking about the following solutions:
>>>> - implement deferral of delivery of membership changes to the client
>>>> - some kind of recovery... Both of these are like reimplementing totem
>>>> inside libcpg.
>>>> - Another solution may be to add an extra callback parameter and also
>>>> deliver the list of nodes that received the message.
>>>>
>>>> Generally, I'm really not very happy with breaking EVS. Yes, the loooong
>>>> message use case is weird and not so common outside pcmk, and yes,
>>>> satellite nodes will break EVS anyway, but for needle we should stay
>>>> very conservative.
>>>>
>>>
>>> Agreed, I didn't realise how bad it could be until late into the
>>> development here. The 'sledgehammer' solution would be to flag when a
>>> confchg has happened during the sending of a long message and if that
>>> happens, invalidate the whole send. It would then mean retransmitting
>>> the whole message again from the start.
>>>
>>
>>
>> ... and here it is!
>>
>> I stopped short of checking the ring state when the message is finally
>> delivered; it just checks it at each transmission stage. Which is what
>> happens with normal sends, of course.
> 
> Rather a small patch in the end, always a good sign :)
> 
> One thing that wasn't completely clear... do the applications need to care about resending or is the client library taking care of that?
> It seems to be the latter, right?
> 

If the application gets CS_ERR_INTERRUPT* then it will need to resend
the message; it's a bit like EAGAIN in that sense.

Chrissie

*If anyone has a better choice for a return code or thinks I should
invent a new one then please say so
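
The resend-on-interrupt behaviour described above could be handled in an
application with a small retry loop. What follows is only a sketch under
stated assumptions, not code from the patch: the sketch_* names, the send
callback, and the attempt limit are all hypothetical stand-ins. A real
application would wrap cpg_mcast_joined() in the callback and compare
against whatever return code the patch finally settles on
(CS_ERR_INTERRUPT is only the proposal in this thread).

```c
#include <stddef.h>

/* Stand-ins for illustration only. Real code would include
 * <corosync/corotypes.h> and use cs_error_t and its constants. */
typedef enum {
    SKETCH_OK,
    SKETCH_ERR_INTERRUPT,   /* models the proposed CS_ERR_INTERRUPT */
    SKETCH_ERR_OTHER
} sketch_err_t;

/* Hypothetical send callback. In a real application this would be a thin
 * wrapper around cpg_mcast_joined(). */
typedef sketch_err_t (*sketch_send_fn)(void *ctx, const void *msg, size_t len);

/* Resend the WHOLE message for as long as the send reports an interruption,
 * giving up after max_attempts tries. Any other result (success or a
 * different error) is returned to the caller immediately. */
static sketch_err_t send_with_retry(sketch_send_fn send, void *ctx,
                                    const void *msg, size_t len,
                                    unsigned max_attempts,
                                    unsigned *attempts_used)
{
    sketch_err_t rc = SKETCH_ERR_INTERRUPT;
    unsigned n = 0;

    while (n < max_attempts) {
        n++;
        rc = send(ctx, msg, len);
        if (rc != SKETCH_ERR_INTERRUPT) {
            break;
        }
    }
    if (attempts_used != NULL) {
        *attempts_used = n;
    }
    return rc;
}

/* Fake transport for demonstration: ctx points to a counter of how many
 * more calls should report an interruption before a send succeeds. */
static sketch_err_t fake_send(void *ctx, const void *msg, size_t len)
{
    unsigned *interruptions_left = ctx;

    (void)msg;
    (void)len;
    if (*interruptions_left > 0) {
        (*interruptions_left)--;
        return SKETCH_ERR_INTERRUPT;
    }
    return SKETCH_OK;
}
```

Bounding the number of attempts is a design choice of this sketch: if
membership keeps changing, an unbounded loop could retransmit a large
message indefinitely, so the caller gets the interruption back and can
decide what to do next.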

