On 13/02/15 12:27, Jan Friesse wrote:
> Chrissie,
>
> Christine Caulfield wrote:
>> It occurs to me that, as this has the potential to break Virtual
>> Synchrony, there should be an option to either disable message
>
> it took me a while until I've found what exactly you mean by break EVS.
> This is something we (or at least I) totally forgot but it looks like
> HUGE problem and I don't even think that pcmk is able to handle this
> situation (I believe you are talking about situation when one node is
> sending long message, and other node will leave and then join again into
> membership during long message is sent, so it will not receive that
> message).
>
>> fragmentation or to some indication of the maximum message size that
>> will not be fragmented.
>>
>> Thoughts?
>
> I'm thinking about following solutions:
> - implement deferral of delivery of membership change to client
> - some kind of recovery... Both of them is like reimplementing totem
>   inside libcpg.
> - Another solution may be to add extra callback parameter and deliver
>   also list of nodes who received message.
>
> Generally, I'm really not very happy with breaking EVS. Yes, loooong
> messages use case is weird and not so common outside pcmk and yes,
> satellite nodes will break EVS anyway, but for needle we should stay
> very conservative.
>

Agreed, I didn't realise how bad it could be until late into the
development here.

The 'sledgehammer' solution would be to flag when a confchg has happened
during the sending of a long message and, if that happens, invalidate the
whole send. It would then mean retransmitting the whole message again
from the start.

Chrissie

>
>>
>> Chrissie
>>
>> On 12/02/15 16:39, Christine Caulfield wrote:
>>> As we discussed at the cluster summit, increasing the message size
>>> inside corosync itself is not only dangerous, but is only needed for a
>>> very few corner cases .. all of which involve CPG.
>>>
>>> So, to allow large CPG messages (which is needed) I have added an extra
>>> facility to libcpg that will fragment messages that are too large for
>>> corosync's internal buffers. It does this transparently to the
>>> application. zero-copy sends are NOT supported for this feature.
>>>
>>> I've also included a test program 'cpghum' that can test this facility
>>> with message sequence numbers and checksums.
>>>
>>> Signed-Off-By: Christine Caulfield <ccaulfie@xxxxxxxxxx>
>>>
>>> _______________________________________________
>>> discuss mailing list
>>> discuss@xxxxxxxxxxxx
>>> http://lists.corosync.org/mailman/listinfo/discuss
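
For illustration only, the 'sledgehammer' idea from the reply above (flag a
confchg seen during a long send, invalidate the send, retransmit from the
start) might look roughly like the sketch below. This is not the patch and
not the libcpg API: struct frag_sender, frag_send(), frag_sender_on_confchg()
and demo_tx() are all hypothetical names, the transport is a plain callback,
and the sketch is single-threaded. How receivers discard an abandoned partial
message is not shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical sender state; none of these names come from libcpg. */
struct frag_sender {
    bool   confchg_seen; /* set when membership changes during a send */
    size_t frags_sent;   /* fragments handed to the transport so far  */
    size_t restarts;     /* how many times a send was invalidated     */
};

/* Hypothetical transport hook: sends one fragment, returns 0 on success. */
typedef int (*frag_tx_fn)(struct frag_sender *s, const char *buf,
                          size_t len, void *ctx);

/* Assumed to be called from the application's confchg handler. */
void frag_sender_on_confchg(struct frag_sender *s)
{
    s->confchg_seen = true;
}

/*
 * Send msg in pieces of at most frag_size bytes. If a confchg is flagged
 * at any point during the send, the whole send is invalidated and started
 * again from offset 0, up to max_restarts times.
 * Returns 0 on success, -1 on transport error or too many restarts.
 */
int frag_send(struct frag_sender *s, const char *msg, size_t len,
              size_t frag_size, unsigned max_restarts,
              frag_tx_fn tx, void *ctx)
{
    assert(frag_size > 0);

    for (;;) {
        size_t off = 0;

        s->confchg_seen = false; /* only changes during this attempt count */
        while (off < len && !s->confchg_seen) {
            size_t n = (len - off < frag_size) ? len - off : frag_size;

            if (tx(s, msg + off, n, ctx) != 0)
                return -1;
            s->frags_sent++;
            off += n;
        }
        if (!s->confchg_seen)
            return 0;            /* whole message went out undisturbed */
        if (s->restarts >= max_restarts)
            return -1;           /* membership too unstable, give up   */
        s->restarts++;           /* invalidate and retransmit from 0   */
    }
}

/*
 * Demo transport: pretends to send, and simulates a single confchg while
 * the fragment with the 1-based index stored in *ctx is in flight
 * (0 means "never").
 */
int demo_tx(struct frag_sender *s, const char *buf, size_t len, void *ctx)
{
    size_t *confchg_at = ctx;

    (void)buf;
    (void)len;
    if (*confchg_at != 0 && s->frags_sent + 1 == *confchg_at) {
        *confchg_at = 0;         /* fire only once */
        frag_sender_on_confchg(s);
    }
    return 0;
}
```

With a 10-byte message, 4-byte fragments and a simulated confchg during the
second fragment, this sketch abandons the first attempt after two fragments
and then resends all three, so five fragments go out in total. The restart
cap is an addition of the sketch to avoid looping forever while membership
keeps changing; the thread does not say how that case should be handled.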