Re: [PATCH] Allow cpg to send large messages

Chrissie,
the patch looks generally good. Can you please fix the indentation in
these places?
- The "Allocate a buffer to contain a full message." comment is not
  aligned.
- cpg_inst_copy.model_v1_data.cpg_deliver_fn (handle, &group_name, ...
  looks like 6 tabs + 8 spaces on the first line and 19 tabs on the
  second and following lines.
- iov[1].iov_len = iovec[i].iov_len - iov_sent is indented with 4 spaces
  instead of one tab.

Also (and this is a bigger problem), I'm not entirely happy with the

usleep(10000);
goto resend;

part. I mean, yes, it makes sense, but on the other hand it may cause the
app to block, potentially forever. What do you think about a limited
number of retries?
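
To make the idea concrete, a bounded retry could look roughly like the
sketch below. This is only an illustration, not code from the patch:
send_with_retry, flaky_send, sleep_usec and the SEND_* codes are made-up
names (the real code would use cs_error_t values such as
CS_ERR_TRY_AGAIN), and the retry count and delay are arbitrary.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical status codes standing in for cs_error_t values such as
 * CS_OK and CS_ERR_TRY_AGAIN; they are not part of the corosync API. */
enum send_status { SEND_OK = 0, SEND_TRY_AGAIN = 1, SEND_FAILED = 2 };

/* Hypothetical callback type: one attempt to send a fragment. */
typedef enum send_status (*send_fn_t)(void *ctx);

/* Sleep for usec microseconds, resuming if a signal interrupts us. */
static void sleep_usec(long usec)
{
	struct timespec ts;

	ts.tv_sec = usec / 1000000L;
	ts.tv_nsec = (usec % 1000000L) * 1000L;
	while (nanosleep(&ts, &ts) == -1 && errno == EINTR) {
		/* ts now holds the remaining time; sleep again */
	}
}

/*
 * Call send_fn until it stops reporting SEND_TRY_AGAIN, but retry at
 * most max_retries times. If the limit is hit, SEND_TRY_AGAIN is
 * returned so the caller decides what to do instead of blocking forever.
 */
enum send_status send_with_retry(send_fn_t send_fn, void *ctx,
				 unsigned int max_retries, long delay_usec)
{
	unsigned int attempt;
	enum send_status res;

	assert(send_fn != NULL);

	for (attempt = 0; ; attempt++) {
		res = send_fn(ctx);
		if (res != SEND_TRY_AGAIN) {
			return res;		/* success or hard failure */
		}
		if (attempt >= max_retries) {
			return SEND_TRY_AGAIN;	/* give up, report to caller */
		}
		sleep_usec(delay_usec);
	}
}

/* Demo stub: ctx points to an int counting how many more times the
 * send should report "busy" before it succeeds. */
enum send_status flaky_send(void *ctx)
{
	int *busy_left = ctx;

	if (*busy_left > 0) {
		(*busy_left)--;
		return SEND_TRY_AGAIN;
	}
	return SEND_OK;
}
```

With something like this, the application gets the "try again" error back
after a bounded time and can decide for itself whether to keep waiting.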

The last thing is cpghum_LDADD. I'm not entirely sure whether libz is
really a dependency of some library corosync is already using. If so, we
don't necessarily need to check for its presence, but if not, we have to
check for it in the configure script... Actually, it would be nice to
test for it anyway.

Regards,
  Honza

Christine Caulfield wrote:
> On 16/02/15 14:10, Christine Caulfield wrote:
>> On 13/02/15 12:27, Jan Friesse wrote:
>>> Chrissie,
>>>
>>> Christine Caulfield wrote:
>>>> It occurs to me that, as this has the potential to break Virtual
>>>> Synchrony, there should be an option to either disable message
>>>
>>> it took me a while to work out what exactly you meant by breaking EVS.
>>> This is something we (or at least I) totally forgot about, but it looks
>>> like a HUGE problem and I don't even think that pcmk is able to handle
>>> this situation (I believe you are talking about the situation where one
>>> node is sending a long message and another node leaves and then rejoins
>>> the membership while the long message is being sent, so it will not
>>> receive that message).
>>>
>>>> fragmentation or to give some indication of the maximum message size
>>>> that will not be fragmented.
>>>>
>>>> Thoughts?
>>>
>>> I'm thinking about the following solutions:
>>> - implement deferral of delivery of the membership change to the client
>>> - some kind of recovery... Both of these are like reimplementing totem
>>> inside libcpg.
>>> - Another solution may be to add an extra callback parameter and also
>>> deliver the list of nodes that received the message.
>>>
>>> Generally, I'm really not very happy with breaking EVS. Yes, the
>>> loooong-message use case is weird and not so common outside pcmk and,
>>> yes, satellite nodes will break EVS anyway, but for needle we should
>>> stay very conservative.
>>>
>>
>> Agreed, I didn't realise how bad it could be until late into the
>> development here. The 'sledgehammer' solution would be to flag when a
>> confchg has happened during the sending of a long message and if that
>> happens, invalidate the whole send. That would then mean retransmitting
>> the whole message again from the start.
>>
> 
> 
> ... and here it is!
> 
> I stopped short of checking the ring state when the message is finally
> delivered; it just checks it at each transmission stage, which is what
> happens with normal sends, of course.
> 
> Chrissie
> 
> 
> 
> 
> 
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss
> 
