On Wed, Jun 3, 2015 at 6:57 AM, Ulrich Windl <Ulrich.Windl@xxxxxxxxxxxxxxxxxxxx> wrote:
>>> Jan Friesse <jfriesse@xxxxxxxxxx> wrote on 03.06.2015 at 13:44 in message
<556EE895.2040107@xxxxxxxxxx>:
> Ulrich Windl wrote:
>> (This is a re-send of 2015-05-15, because the subscription had not actually worked)
>> Hello!
>>
>> I've been monitoring some corosync objctl variables to find out what's going
> on. I have some results, but don't really know what the variables are saying.
> Maybe someone can comment on those; what do they mean?:
>>
>> runtime.totem.pg.mrp.srp.orf_token_rx increases about 142 per second
>> runtime.totem.pg.mrp.srp.memb_merge_detect_tx (and rx) increases about 2 per
> second
>> runtime.totem.pg.mrp.srp.mcast_tx increases about 40 per second
>> runtime.totem.pg.mrp.srp.mcast_rx increases by only 3 per second
>> runtime.totem.pg.mrp.srp.token_hold_cancel_tx (and rx) increases from 1 to 5
> per second
>> runtime.totem.pg.mrp.srp.mtt_rx_token varies from 0 to 24
>>
>> I wonder whether our configuration looks sane or not, and if not which
> parameters to change.
>
> You didn't send configuration. Configuration is stored in
> /etc/corosync/corosync.conf or /etc/cluster/cluster.conf.
I deliberately skipped it to enforce a "blackbox view" on it.
>
> But yes, it is normal and expected for the rx and tx values to increase.
Yes, that was easy to guess even for me. But what about "token_hold_cancel_tx" and "mtt_rx_token"?
These are internal Totem operational indicators. token_hold_cancel_tx counts how many times a token hold cancel has been sent. On a lightly loaded totem network, the token would race around the ring. We introduced a feature which would stop the token after a certain number of rotations and then delay it there for the "token_hold_timeout" period. This would result in latency when a node actually wanted to send a message. So we introduced the token hold cancel message, which tells the cluster to release the token so messages can be sent.
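To make the hold/cancel interplay described above concrete, here is a toy sketch in Python. It is not corosync code and does not follow the Totem specification; the class HeldToken, the method release_time and the 180 ms value are made up for illustration, and the counter only borrows the name token_hold_cancel_tx. In the real protocol the cancel is a message sent by the node that wants to transmit; here that is collapsed into a single object.

```python
from dataclasses import dataclass


@dataclass
class HeldToken:
    """Toy model of a token parked on an idle ring (hypothetical, not corosync)."""

    hold_timeout_ms: float          # plays the role of token_hold_timeout
    held_since_ms: float            # time at which the hold started
    token_hold_cancel_tx: int = 0   # toy counter named after the real statistic

    def release_time(self, now_ms, pending_message, cancel_enabled=True):
        """Return the time at which the token starts moving again."""
        natural_release = self.held_since_ms + self.hold_timeout_ms
        if pending_message and cancel_enabled and now_ms < natural_release:
            # A node has something to send: a hold cancel frees the token now.
            self.token_hold_cancel_tx += 1
            return now_ms
        # No cancel: the token stays parked until the hold period has expired.
        return max(now_ms, natural_release)


# A message becomes pending 50 ms into an (assumed) 180 ms hold.
with_cancel = HeldToken(hold_timeout_ms=180.0, held_since_ms=0.0)
without_cancel = HeldToken(hold_timeout_ms=180.0, held_since_ms=0.0)
t_with = with_cancel.release_time(50.0, pending_message=True)
t_without = without_cancel.release_time(50.0, pending_message=True,
                                        cancel_enabled=False)
print(t_with, t_without, with_cancel.token_hold_cancel_tx)
```

In this toy model the sender waits 130 ms less when the cancel is available, and each such early release bumps the counter once, which is why a few increments per second on a mostly idle ring would not be surprising.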
mtt_rx_token is an internal flow control variable and is documented in great detail in the Totem specification. It is difficult to explain to someone who hasn't read the Totem specification hundreds of times, but in essence it is part of the totem algorithm that keeps the flow of messages coming equally from each node as the token rotates around the network. How precisely it does this could warrant a five-page essay, so I'll spare you the details.
Hope these details help.
regards
-steve
>
>>
>> corosync-1.4.7 of SLES11 running on a Xen paravirtualized host...
>
> I would suggest asking SUSE, because their version may be different from
> the upstream one. Also, this is why you are paying them for support, isn't it?
I doubt they changed the fundamental meaning of the variables or the basic protocol.
Or is it a polite way of saying "I don't like to help you"?
If it calms you down: I have actually had an issue open with SLES support for weeks about an odd communication problem.
Regards,
Ulrich
_______________________________________________
discuss mailing list
discuss@xxxxxxxxxxxx
http://lists.corosync.org/mailman/listinfo/discuss