Re: RFC: NUMA modifications to cyclictest

Thomas Gleixner <tglx@xxxxxxxxxxxxx> · Wed, 20 Jan 2010 07:51:41 +0100 (CET)

On Tue, 19 Jan 2010, Clark Williams wrote:
> RT-ers,
> 
> Lately we've been struggling with some performance issues on
> high-core-count (>16 cores) NUMA machines with the RT kernel. During the course
> of troubleshooting this issue, we tried using the 'numactl' program to
> constrain our measurement testing tool (rteval) to a particular memory
> node, rather than letting everything float. Doing so showed marked
> improvement in both max latency and jitter. While this doesn't solve
> our performance problems, I thought it might make sense to have a --numa
> mode for cyclictest that complements the --smp mode just added.
> 
> The big difference here is that when using --numa, each measurement
> thread (one per cpu) has its stack allocated from the memory node
> associated with its cpu. Also, the major data structures for each
> thread (parameter block, statistics block and histogram) are allocated
> from the appropriate node. This is done with calls into libnuma,
> which means this will add a dependency on libnuma. 

That might cause some trouble for embedded folks. :(
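
One way to keep the dependency optional, sketched here under stated
assumptions: hide the allocation calls behind small wrappers so that
libnuma is only pulled in by a build-time switch, with a plain heap
fallback otherwise. USE_LIBNUMA, threadalloc(), threadfree(),
node_of_cpu() and struct thread_stat are made-up names for illustration,
not taken from the cyclictest source; numa_available(),
numa_node_of_cpu(), numa_alloc_onnode() and numa_free() are the libnuma
calls being wrapped.

```c
/* Sketch only: node-local allocation with an optional libnuma dependency.
 * Build with -DUSE_LIBNUMA -lnuma for NUMA placement; without the switch
 * the wrappers fall back to calloc()/free(). All wrapper names are
 * hypothetical. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#ifdef USE_LIBNUMA
#include <numa.h>
#endif

/* Hypothetical per-thread statistics block. */
struct thread_stat {
    unsigned long cycles;
    long min;
    long max;
    double avg;
};

/* Return 1 if node-local allocation is actually in effect. */
static int numa_mode_active(void)
{
#ifdef USE_LIBNUMA
    return numa_available() != -1;
#else
    return 0;
#endif
}

/* Map a cpu to its memory node; always node 0 without libnuma. */
static int node_of_cpu(int cpu)
{
#ifdef USE_LIBNUMA
    if (numa_mode_active())
        return numa_node_of_cpu(cpu);
#endif
    (void)cpu;
    return 0;
}

/* Allocate zeroed memory, on the given node when NUMA mode is active. */
static void *threadalloc(size_t size, int node)
{
    assert(size > 0);
#ifdef USE_LIBNUMA
    if (numa_mode_active()) {
        void *p = numa_alloc_onnode(size, node);
        /* Touch the pages now so they are not faulted in while measuring. */
        if (p)
            memset(p, 0, size);
        return p;
    }
#endif
    (void)node;
    return calloc(1, size);
}

/* Release memory obtained from threadalloc(); size must match. */
static void threadfree(void *p, size_t size)
{
#ifdef USE_LIBNUMA
    if (numa_mode_active()) {
        numa_free(p, size);
        return;
    }
#endif
    (void)size;
    free(p);
}
```

A measurement thread for cpu N would then call
threadalloc(sizeof(struct thread_stat), node_of_cpu(N)) for each of its
blocks. Built without the switch, the same code compiles and runs on a
system that has no libnuma at all.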

> The intent is to measure latency on a NUMA system in the same way a
> well-written RT application would run on a NUMA machine, that is,
> minimizing the off-node memory references.

Agreed.
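
Node-local memory only stays local if the thread stays on a cpu of that
node, so the allocation has to be paired with pinning. A minimal sketch
of the pinning half, assuming Linux and glibc; pin_to_cpu() and
first_allowed_cpu() are hypothetical helper names, while
sched_setaffinity() and sched_getaffinity() are the real calls:

```c
/* Sketch only: pin the calling thread to one cpu so that memory
 * allocated on that cpu's node is not accessed from another node. */
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <sched.h>

/* Pin the calling thread to 'cpu'. Returns 0 on success, -1 on error. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* pid 0 means the calling thread */
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Return the lowest cpu the calling thread may run on, or -1 on error. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;
    int cpu;

    if (sched_getaffinity(0, sizeof(set), &set) != 0)
        return -1;
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++)
        if (CPU_ISSET(cpu, &set))
            return cpu;
    return -1;
}
```

Each measurement thread would call pin_to_cpu() with its own cpu number
before touching its data; first_allowed_cpu() is only there so the
sketch can be tried inside a restricted cpuset.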

	tglx