Performance impact of SMTC varies enormously with the job mix and the
system configuration. The 34K core is based on the 24K pipeline, which
is pretty efficient, so if you're not missing in the caches or doing a
lot of targeted multiplies, there's not always a whole lot of dead
cycles to fill with multiple threads. And to sponge up cache miss
stall cycles, the memory controller has to be able to handle the
multiple streams of outstanding requests, which isn't the case in some
systems that were originally designed for the 24K. I gave a paper at
the HiPEAC conference last January which showed the performance impact
of SMTC on various microbenchmarks. This looks to (finally) be freely
downloadable at http://www.springerlink.com/index/787307253g2644h4.pdf

If you're not planning on doing anything else with the other VPE
(i.e. RTOS or some other scheduling domain), using it for virtual SMP
is an option. The kernel will be a little smaller than SMTC, and some
internal functions will be faster. You'll be limited to 2-way
parallelism, but going from 1 to 2 is always the step that gives the
biggest performance increase. The general pattern seems to be that
going from 2 to 3 gives you half again what you got from 1->2, 3->4
gives you half again what you got from 2->3, etc., up to the point
where the pipeline saturates or you start thrashing the cache. Most
of the experiments I ran showed the sweet spot to be 3 or 4 threads
per core (see the illustrative numbers at the end of this note).

SMTC also has the slight advantage of making all threads use a common
ASID space and share the same TLB, while the VPE SMP scheme splits
the TLB between the two VPEs. This makes a difference if you're
running with large, parallel working sets, but you won't see much of
an impact on small benchmarks.

I think that the biggest potential advantage of SMTC probably comes
not from increasing throughput per se, but from using it in
conjunction with the YIELD instruction to provide zero-latency
user-mode event handling, though one has to have the right signals
wired to the YQ inputs of the core to exploit it (a rough sketch
follows at the end of this note).

Regards,

            Kevin K.
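
To make the "half again" scaling pattern above concrete, here is a
purely illustrative calculation. The 1.40x figure for the 1->2 step is
an assumption chosen for easy arithmetic, not a measurement:

    1 thread:   1.00x
    2 threads:  1.40x   (+0.40 from adding the second thread)
    3 threads:  1.60x   (+0.20, half of the previous gain)
    4 threads:  1.70x   (+0.10, half again)
    5 threads:  1.75x   (+0.05, pipeline close to saturated)

With gains halving at each step, most of the available benefit has been
collected by 3 or 4 threads, which is consistent with the sweet spot
mentioned above.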
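The YIELD-based event handling mentioned above can be sketched roughly
as follows. This is only a sketch under stated assumptions: it assumes
a MIPS MT core where the event of interest is wired to (hypothetically)
YQ[0], that the OS has set YQMask so this thread is allowed to wait on
that qualifier, and a toolchain that accepts MT ASE instructions (e.g.
via ".set mt"). The names MY_EVENT_YQ_MASK, wait_for_event and
event_loop are made up for illustration, not part of any existing API.

    /* Hypothetical: the external event is wired to YQ input 0. */
    #define MY_EVENT_YQ_MASK 0x1

    /*
     * Suspend this thread context until one of the yield qualifiers
     * in yq_mask is asserted.  Returns the vector of asserted
     * qualifiers that woke the thread.
     */
    static inline unsigned int wait_for_event(unsigned int yq_mask)
    {
            unsigned int active;

            __asm__ __volatile__(
                    "       .set    push            \n"
                    "       .set    mt              \n"
                    "       yield   %0, %1          \n"
                    "       .set    pop             \n"
                    : "=r" (active)
                    : "r" (yq_mask)
                    : "memory");

            return active;
    }

    /*
     * User-mode event loop: the handling thread sleeps in hardware,
     * with no interrupt or scheduler involvement, until the signal
     * arrives on the YQ input.
     */
    void event_loop(void)
    {
            for (;;) {
                    unsigned int which = wait_for_event(MY_EVENT_YQ_MASK);

                    /* ... dispatch on 'which' and handle the event ... */
                    (void)which;
            }
    }

The point of the sketch is only to show the shape of the mechanism: the
waiting thread blocks directly in the YIELD instruction and resumes as
soon as the qualifier is asserted, so there is no interrupt entry or
context switch on the wakeup path.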