Re: High latency of message passing in pload test

Steven Dake <sdake@xxxxxxxxxx> · Thu, 24 May 2012 07:52:24 -0700

On 05/23/2012 07:28 AM, Voznesensky Vladimir wrote:
> Hello.
> 
> I've tested Corosync latency without IPC.
> 
> First, I found the Pload module and instrumented its data messages
> with TSC values, so I was able to measure the CPU ticks spent in the
> Corosync stack for each message.
> Second, I measured the average ticks needed to pass 300-byte messages in
> different batch sizes, with and without Totem, on a single node.
> Totem was disabled in accordance with the "[PATCH] Example
> short-circuit of totem for single node cases - for TESTING ONLY" message
> of 18 May from this mailing list.
> 
> bindnetaddr: 127.0.0.1
> hold: 0
> 
> I ran corosync with the CPU affinity set to a single hyper-thread
> to prevent the process from losing cache and context switching, though
> it does not help much: the results are stable only for batch sizes of
> 10000 and 100000.
> 
> # taskset 0x1 /usr/sbin/corosync -f
> 
> 
> The latency (in ticks)/throughput(in TP/S) for each variant:
> 
> batch size| with Totem | w/o Totem
> 1         |  596K/5.4M |  58K/5M
> 10        |  0.8M/1.9M |  82K/3M
> 100       |  5.0M/1.2M |  5M/0.9M
> 1000      |  7.9M/1.4M |  8M/1.3M
> 10000     | 24.4M/2.0M | 23M/2.1M
> 100000    | 31.8M/1.2M | 31M/1.3M
> 
> There is a large spread in latencies for the 1 and 10 message batches:
> some of the cases are small (tens of kiloticks), and some of them are
> big (millions of ticks). I have no explanation.
> 
> The latency does not seem to depend on the totem implementation, but
> looks very high. Am I doing something wrong?
>

I would expect totem latency to be a bit higher.  The mean latency can be
calculated as 1/2 * token rotation time.  Larger batch sizes result in
more messages being assembled into an MTU-sized message, which will
provide better throughput (at the cost of latency, because token
rotation now takes longer).
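
To show where the 1/2 factor comes from, here is a rough sketch.  It
assumes (this is an assumption made for illustration, not something
measured) that the token rotation time T is constant and that a message
is queued at a time t uniformly distributed within one rotation, so it
waits W = T - t for the token to come around:

```latex
% T: token rotation time (assumed constant)
% t: time within the rotation at which the message is queued, t ~ U(0, T)
% W = T - t: wait until the token next arrives
\[
\mathbb{E}[W]
  = \int_0^T \frac{T - t}{T}\,dt
  = \frac{1}{T}\left[\,Tt - \frac{t^2}{2}\,\right]_0^T
  = \frac{T}{2}
\]
```

This only covers the wait for the token; any delivery cost after the
token arrives comes on top of it.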

I don't particularly trust tsc.  I would prefer to see a
clock_gettime(CLOCK_REALTIME) analysis.
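
As a minimal sketch of what I mean (not the pload code; the helper
names now_ns and latency_ns are hypothetical and only for illustration),
the sender would stamp each message with a CLOCK_REALTIME value and the
receiver would subtract it from its own reading:

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: current CLOCK_REALTIME reading in nanoseconds,
 * or -1 if clock_gettime() fails. */
static int64_t now_ns(void)
{
    struct timespec ts;

    if (clock_gettime(CLOCK_REALTIME, &ts) != 0)
        return -1;
    return (int64_t)ts.tv_sec * 1000000000LL + (int64_t)ts.tv_nsec;
}

/* Hypothetical helper: latency of one message, given the timestamp
 * stored in the message when it was sent and the timestamp taken when
 * it was delivered (both in nanoseconds). */
static int64_t latency_ns(int64_t sent_ns, int64_t recv_ns)
{
    return recv_ns - sent_ns;
}
```

The sender would store now_ns() in the message payload in place of the
TSC value, and the deliver callback would call
latency_ns(stamp, now_ns()).  On a single node both readings come from
the same clock, so no cross-node synchronization is needed.
CLOCK_MONOTONIC would be a drop-in alternative if wall-clock
adjustments during the run are a concern.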

Totem doesn't make much difference here because the latency you see when
running with cpg is almost entirely introduced by IPC.  Just for
context: from openais 0.80.x to corosync 1.0, throughput went way up and
latency went way down; from corosync 1.0 to 2.0, throughput went way up
and latency went way down again.  corosync 2.0 -> 3.0 may follow the
same pattern.

> Thanks.
> 
> -- VV
> _______________________________________________
> discuss mailing list
> discuss@xxxxxxxxxxxx
> http://lists.corosync.org/mailman/listinfo/discuss
