Peter Arremann <loony@xxxxxxxxxxxx> wrote:
> The suggestion of 8 was made mostly because there was no
> larger x86-64 platform available at that time.

Opteron 8xx processors are so named because 8-way is the maximum
number of Opterons that can be wired up with 3 HyperTransport links
each while keeping every CPU within 2 hops of any other.  Beyond
8-way you start to run into excessive hops, and that requires
further design considerations in both hardware and software.

I know many vendors are selling "scalable" 4-way Socket-940 boards
these days with two 3.2-8.0GBps HyperTransport connectors for daisy
chaining mainboards.  But HyperTransport eXpansion (HTX) is now the
preferred way to build clusters of 4-way Socket-940 boards, with
each system running its own OS.  InfiniBand over HTX is capable of
a "real world" 1.8GBps -- over 100% faster than the "real world"
performance of InfiniBand PCI-X 2.0 cards (typically used with
Xeon/Itanium).

> Also, the main reason for the limit is scalability. With
> more cpus comes more communications overhead, more
> congestion on the bus,

There is _no_ "bus" in Opteron.  Yes, Opterons will "share"
HyperTransport links when they cannot directly connect to another
CPU, but that is link sharing on a point-to-point fabric, not a
shared bus.

> less memory bandwidth for each cpu and so on.

Okay, this is _misleading_.  You're thinking Intel SMP.  Opterons
_always_ have 128 bits of DDR (2 channels) per CPU.  Opteron uses
NUMA (and HyperTransport partial meshes for CPU-I/O).  There is
_no_ "less memory bandwidth for each cpu."  That is a trait of
Intel SMP [A]GTL+, not AMD NUMA/HyperTransport.

Yes, if an Opteron has to access memory hanging off another CPU,
that is a performance hit.  And if the other CPU is on another
mainboard, then yes, contention can happen there.

> I remember in the good old days when smp was first added
> to the kernel, people said 2 CPUs was the max you can have...

I remember non-Linux/non-PC platforms where MP, not SMP, was used
-- from true crossbar switches (not "bus hubs") to the partial mesh
we now have in the Opteron.

In fact, this is one of the areas where Linux is very immature.
Its logic is still very SMP, it only has NUMA "hints," and it does
not scale well on a NUMA platform, let alone the partial mesh of
the Opteron 800's 2xDDR/3xHyperTransport _per_ CPU -- especially
when it comes to processor affinity for I/O.  It's a crapload
better than NT, but not better than many UNIX implementations.

Sun's support of the Opteron then became a no-brainer.  They could
deliver a partial-mesh platform at a commodity cost.

> we ran a few 4-way systems back then very effectively simply
> because our application had only a low volume of
> communications.

But you're still accessing memory.  I assume it was an Intel SMP
solution, and it therefore had the memory access limitations you
describe.  Those limitations are still wholly _inapplicable_ to
Opteron if you have an application and operating system that are
effective at processor affinity for processes.  And when it comes
to communication, processor affinity for I/O can do wonders -- but
_only_ on Opteron, not even on proprietary Xeon MP / Itanium
systems (because they are still "Front Side Bottleneck" designs).

I understand what you're trying to say.  But it's not very
applicable to Opteron in the least bit.

--
Bryan J. Smith                | Sent from Yahoo Mail
mailto:b.j.smith@xxxxxxxx     | (please excuse any
http://thebs413.blogspot.com/ | missing headers)
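P.S.  Since "processor affinity for processes" came up: for anyone
who wants to see what that looks like on Linux, here is a minimal,
untested sketch in C.  It pins the current process to one CPU with
the glibc sched_setaffinity() call and then allocates its working
buffer on that CPU's NUMA node with libnuma, so accesses stay on
the local Opteron's DDR channels instead of crossing HyperTransport
to another CPU's memory.  CPU 0, node 0 and the 64MB buffer size
are arbitrary placeholders, not anything from the discussion above.

    /*
     * Sketch: pin this process to one CPU, then allocate its buffer
     * on that CPU's NUMA node so memory accesses stay local.
     * Build with something like:  gcc affinity.c -lnuma
     * CPU 0 / node 0 / 64MB are placeholder values.
     */
    #define _GNU_SOURCE
    #include <sched.h>   /* sched_setaffinity(), CPU_ZERO(), CPU_SET() */
    #include <numa.h>    /* numa_available(), numa_alloc_onnode(), numa_free() */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        cpu_set_t mask;
        size_t len = 64UL * 1024 * 1024;   /* 64MB working buffer */
        char *buf;

        /* 1. Bind this process to CPU 0 only. */
        CPU_ZERO(&mask);
        CPU_SET(0, &mask);
        if (sched_setaffinity(0, sizeof(mask), &mask) != 0) {
            perror("sched_setaffinity");
            return 1;
        }

        /* 2. Allocate the buffer on NUMA node 0 -- the DDR attached
         *    to the same Opteron we are now pinned to.              */
        if (numa_available() < 0) {
            fprintf(stderr, "kernel has no NUMA support\n");
            return 1;
        }
        buf = numa_alloc_onnode(len, 0);
        if (buf == NULL) {
            fprintf(stderr, "numa_alloc_onnode failed\n");
            return 1;
        }

        /* Touch the pages so they are actually faulted in locally. */
        memset(buf, 0, len);

        /* ... real work on buf goes here ... */

        numa_free(buf, len);
        return 0;
    }

Roughly the same effect can be had without touching the code by
launching the program under numactl, e.g.
"numactl --cpubind=0 --membind=0 ./app".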