On Tue, 2005-04-26 at 01:32 -0700, William Yu wrote:
> Linux 2.6 does have NUMA support. But whether it's actually a for
> Postgres is debatable due to the architecture.
>
> First let's take a look at how NUMA makes this run faster in a 2x
> Opteron system. The idea is that the processes running on CPU0 can
> access memory attached to that CPU a lot faster than memory attached to
> CPU1. So in a NUMA-aware system, a process running on CPU0 can request
> all it's memory be located memory bank0.
[...]

This is only part of the truth. You should compare it with real SMP
solutions. The idea is that CPU0 can access directly attached memory
faster than it could on an SMP system built with roughly equivalent
technology. So NUMA has a fast path and a slow path, while SMP has only
one, uniform, medium path.

The whole point is where the SMP path lies. If it's close to the fast
(local) path in NUMA, then NUMA won't pay off (performance-wise) unless
the application is NUMA-aware _and_ NUMA-friendly (which depends on how
the application is written, assuming the underlying problem _can_ be
solved in a NUMA-friendly way).

If the SMP path is close to the slow (remote) path in NUMA (for example,
because the CPUs have to keep the caches coherent and obey memory
barriers and locks), then NUMA has little to lose for NUMA-unaware or
NUMA-unfriendly applications (worst case), and a lot to gain when some
NUMA-aware optimizations kick in.

I've read some articles that mention the name SUMO (sufficiently uniform
memory organization), which AMD would use to describe their NUMA. That
seems to imply that their worst case is "sufficiently" close to the
usual SMP timing.

There are other interesting issues in SMP scaling, on the software side.
Scaling with N > 8 might force partitioning at the software level anyway,
in order to reduce the number of locks, both as software objects
(reducing software complexity) and as hardware events (reducing time
spent in useless synchronization).
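
To make the fast path / slow path comparison concrete, here is a rough sketch of the arithmetic in Python. The latency figures (60 ns local, 100 ns remote, 70 or 95 ns for SMP) are made-up numbers for illustration only, not measurements of any real Opteron or SMP box, and the function names are mine. The model is just a weighted average and ignores caches, contention and bandwidth.

```python
def numa_mean_latency(local_fraction, local_ns, remote_ns):
    """Mean memory latency when local_fraction of accesses hit local memory."""
    return local_fraction * local_ns + (1.0 - local_fraction) * remote_ns


def break_even_local_fraction(smp_ns, local_ns, remote_ns):
    """Local fraction at which NUMA's mean latency equals the uniform SMP latency."""
    return (remote_ns - smp_ns) / (remote_ns - local_ns)


# Hypothetical latencies in nanoseconds (assumptions, not measurements).
LOCAL_NS = 60.0
REMOTE_NS = 100.0

# A NUMA-unaware application on two nodes: assume half of its accesses are local.
unaware = numa_mean_latency(0.5, LOCAL_NS, REMOTE_NS)

# Case A: SMP path close to the NUMA fast path.
smp_fast = 70.0
# Case B: SMP path close to the NUMA slow path.
smp_slow = 95.0

print("NUMA-unaware mean latency:", unaware)
print("break-even local fraction, case A:",
      break_even_local_fraction(smp_fast, LOCAL_NS, REMOTE_NS))
print("break-even local fraction, case B:",
      break_even_local_fraction(smp_slow, LOCAL_NS, REMOTE_NS))
```

With these invented numbers, case A needs three quarters of all accesses to be local before NUMA breaks even, so a NUMA-unaware application loses. In case B one access in eight is enough, so even the unaware application comes out ahead.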
On software-level SMP scaling, see: http://www.bitmover.com/llnl/smp.pdf

This also affects ccNUMA, of course; I'm not saying NUMA avoids this in
any way. But it's a price _both_ have to pay, moving their numbers
towards the worst case anyway (which makes the worst case not so bad).

.TM.
-- 
Marco Colombo
Technical Manager
ESI s.r.l.
Colombo@xxxxxx