Pete, thanks a lot! I'm learning.

Bob

Pete Huckelba wrote:

> At 07:35 PM 10/14/2002, you wrote:
>
>> I'm new to the CPU-level stuff and don't have much background in
>> it. So some of the terms are new for me. Let me get really basic:
>> why would you want multiple processors on a motherboard as opposed to
>> a single fast processor?
>
> More is always better, unless you start talking about a drowning man
> and cups of water. As for wanting multiple processors instead of a
> single fast one, you of course would always want to have the fastest
> processor(s) in the greatest numbers. In some multi-user environments
> jobs are numerous but small, and are often run by many users
> simultaneously. In this case, a multi-processor machine would be
> preferable over a single processor machine since the jobs can be run
> at the same time without being queued. Jobs that take more time will
> benefit from a faster processor. What you want to keep in mind when
> looking at a multi-processor machine is the type of application(s) you
> plan on running. If the application is not multi-threaded, you will
> not benefit from the extra processor(s) unless you are running
> multiple jobs simultaneously.
>
>> I take it that the Xeon line is for multiple CPU motherboards -- you
>> don't just run one Xeon, am I right? What does it mean, to be 'cpu
>> cache bound'?
>
> Yes, Xeons were designed for MP. From what I understand, you can run
> a single Xeon, which in effect is just a P4. You can of course read
> more and see a visual comparison at:
> http://intel.com/support/processors/xeon/diff.htm and
> http://intel.com/support/processors/pentium4/p4compare.htm or google
> it. As for being cache-bound, (my layman's understanding) just think
> of the cache as being a super-fast storage area. Instead of having to
> pull the data across the FSB, the data is stored on the CPU die, making
> access time as close to real time as you can get.
> More info at:
> http://www-2.cs.cmu.edu/~tcm/thesis/subsubsection2_10_1_3_2.html
>
>> Do your comments also mean the Red Hat kernel won't need testing on
>> the new Hyper Threaded P4s?
>
> I have a couple of dual 2GHz Xeons each with 2GB of PC800, one running
> hyperthreaded, one running normal. Depending on the job, the overhead
> associated with running hyperthreaded is enormous. Here are some stats
> for you:
>
> [cph@blur ~]$ cat /proc/version /proc/cpuinfo /proc/meminfo
> Linux version 2.4.9-34smp (bhcompile@daffy.perf.redhat.com) (gcc
> version 2.96 20000731 (Red Hat Linux 7.2 2.96-108.1)) #1 SMP Sat Jun 1
> 06:15:25 EDT 2002
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 15
> model : 2
> model name : Intel(R) XEON(TM) CPU 2.00GHz
> stepping : 4
> cpu MHz : 1995.162
> cache size : 512 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr mca cmov pat
> pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
> bogomips : 3984.58
>
> processor : 1
> vendor_id : GenuineIntel
> cpu family : 15
> model : 2
> model name : Intel(R) XEON(TM) CPU 2.00GHz
> stepping : 4
> cpu MHz : 1995.162
> cache size : 512 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr mca cmov pat
> pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
> bogomips : 3984.58
>
> total: used: free: shared: buffers: cached:
> Mem: 2107416576 2077827072 29589504 0 458694656 1494269952
> Swap: 1077501952 0 1077501952
> MemTotal: 2058024 kB
> MemFree: 28896 kB
> MemShared: 0 kB
> Buffers: 447944 kB
> Cached: 1459248 kB
> SwapCached: 0 kB
> Active: 1211376 kB
> Inact_dirty: 454344 kB
> Inact_clean: 241472 kB
> Inact_target: 524016 kB
> HighTotal: 1178560 kB
> HighFree: 15324 kB
> LowTotal: 879464 kB
> LowFree: 13572 kB
> SwapTotal: 1052248 kB
> SwapFree: 1052248 kB
>
> and the machine with hyperthreading enabled shows:
>
> [cph@conroe ~]$ cat /proc/cpuinfo
> processor : 0
> vendor_id : GenuineIntel
> cpu family : 15
> model : 2
> model name : Intel(R) XEON(TM) CPU 2.00GHz
> stepping : 4
> cpu MHz : 1995.164
> cache size : 512 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
> bogomips : 3984.58
>
> processor : 1
> vendor_id : GenuineIntel
> cpu family : 15
> model : 2
> model name : Intel(R) XEON(TM) CPU 2.00GHz
> stepping : 4
> cpu MHz : 1995.164
> cache size : 512 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
> bogomips : 3984.58
>
> processor : 2
> vendor_id : GenuineIntel
> cpu family : 15
> model : 2
> model name : Intel(R) XEON(TM) CPU 2.00GHz
> stepping : 4
> cpu MHz : 1995.164
> cache size : 512 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
> bogomips : 3984.58
>
> processor : 3
> vendor_id : GenuineIntel
> cpu family : 15
> model : 2
> model name : Intel(R) XEON(TM) CPU 2.00GHz
> stepping : 4
> cpu MHz : 1995.164
> cache size : 512 KB
> fdiv_bug : no
> hlt_bug : no
> f00f_bug : no
> coma_bug : no
> fpu : yes
> fpu_exception : yes
> cpuid level : 2
> wp : yes
> flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov
> pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm
> bogomips : 3984.58
>
> [cph@conroe ~]$ cat /proc/meminfo
> total: used: free: shared: buffers: cached:
> Mem: 2113667072 2008354816 105312256 0 212000768 1683333120
> Swap: 2146754560 0 2146754560
> MemTotal: 2064128 kB
> MemFree: 102844 kB
> MemShared: 0 kB
> Buffers: 207032 kB
> Cached: 1643880 kB
> SwapCached: 0 kB
> Active: 948732 kB
> Inact_dirty: 848948 kB
> Inact_clean: 62824 kB
> Inact_target: 372100 kB
> HighTotal: 1178560 kB
> HighFree: 19404 kB
> LowTotal: 885568 kB
> LowFree: 83440 kB
> SwapTotal: 2096440 kB
> SwapFree: 2096440 kB
> Committed_AS: 7888 kB
>
> As a speed test, I ran some test certifications of our statistical
> software; the machine running in hyperthreaded mode was significantly
> slower than the dual Xeon running in "native" mode. We have a
> certification script that I put in a batch, kicking off two on the
> dual Xeon and four on the dual Xeon running hyperthreaded.
>
> non-hyper:
>
> real 33m38.261s
> user 31m38.750s
> sys 1m48.790s
>
> real 33m40.230s
> user 31m40.630s
> sys 1m48.660s
>
> hyperthread enabled:
>
> real 58m31.635s
> user 56m5.110s
> sys 2m5.660s
>
> real 58m43.463s
> user 56m10.390s
> sys 2m8.450s
>
> real 58m56.267s
> user 56m20.470s
> sys 2m10.940s
>
> real 58m59.632s
> user 56m27.340s
> sys 2m6.860s
>
> So, while it could be argued that the hyperthreaded machine suffered a
> bit from being I/O bound on the hard drive, that period of time was
> negligible versus being cache-bound as stated by Mr. Flory below. Also
> of note: there is some overhead, which I have not investigated, in how
> the kernel handles hyperthreading.
> A top shows:
>
>   PID USER  PRI  NI  SIZE  RSS SHARE STAT %CPU %MEM  TIME COMMAND
>     1 root   15   0   404  404   356 S     0.0  0.0  0:10 init
>     2 root   0K   0     0    0     0 SW    0.0  0.0  0:00 migration_CPU0
>     3 root   0K   0     0    0     0 SW    0.0  0.0  0:00 migration_CPU1
>     4 root   0K   0     0    0     0 SW    0.0  0.0  0:00 migration_CPU2
>     5 root   0K   0     0    0     0 SW    0.0  0.0  0:00 migration_CPU3
>     6 root   15   0     0    0     0 SW    0.0  0.0  0:00 keventd
>     7 root   34  19     0    0     0 SWN   0.0  0.0  0:02 ksoftirqd_CPU0
>     8 root   34  19     0    0     0 SWN   0.0  0.0  0:00 ksoftirqd_CPU1
>     9 root   34  19     0    0     0 SWN   0.0  0.0  0:00 ksoftirqd_CPU2
>    10 root   34  19     0    0     0 SWN   0.0  0.0  0:00 ksoftirqd_CPU3
>
> Where "migration_CPUX" is not seen on the non-hyperthreaded version.
> What is interesting to notice is that the migration process never
> consumes time, memory or CPU. hmmmmmmmm.
>
> Pete
>
>> Thanks
>>
>> Bob Cochran
>>
>> Samuel Flory wrote:
>>
>>> Red Hat has supported this since one of the 7.2 kernel updates. This
>>> is old hat on the current crop of Xeon (aka P4 Xeon). Linux treats
>>> them as multiple cpus. Don't assume that this will make your system
>>> faster. If you tend to have only one process active at a time then it
>>> will slow things down. It's also really bad if you are cpu cache bound.
>>
>> --
>> Psyche-list mailing list
>> Psyche-list@redhat.com
>> https://listman.redhat.com/mailman/listinfo/psyche-list
>
> --------------------------
> Pete Huckelba
>
> Stata Corporation
> 4905 Lakeway Drive
> College Station, TX 77845
> (979)696-4600

--
Psyche-list mailing list
Psyche-list@redhat.com
https://listman.redhat.com/mailman/listinfo/psyche-list
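
For anyone who wants to redo the arithmetic on the "real" timings quoted in Pete's message, here is a minimal Python sketch. It is not something Pete ran: the names `to_seconds` and `summarize` are made up for illustration, and it assumes all jobs in a batch were started at the same moment, so a batch is treated as finished when its slowest job finishes.

```python
import re


def to_seconds(stamp: str) -> float:
    """Convert a shell `time` value such as '33m38.261s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# "real" (wall-clock) times copied from the two runs quoted in the thread.
NON_HT_REAL = ["33m38.261s", "33m40.230s"]
HT_REAL = ["58m31.635s", "58m43.463s", "58m56.267s", "58m59.632s"]


def summarize(real_times):
    """Return (mean wall time per job in seconds, jobs per hour for the batch).

    Assumption: every job in the batch started together, so the batch
    takes as long as its slowest job.
    """
    secs = [to_seconds(t) for t in real_times]
    mean_job = sum(secs) / len(secs)
    jobs_per_hour = len(secs) / max(secs) * 3600
    return mean_job, jobs_per_hour


if __name__ == "__main__":
    for label, runs in (("non-hyper", NON_HT_REAL), ("hyperthread", HT_REAL)):
        mean_job, rate = summarize(runs)
        print(f"{label:12s} mean job {mean_job / 60:6.2f} min, {rate:5.2f} jobs/hour")
```

On these figures the mean wall time per job is roughly 1.75 times longer with hyperthreading enabled, which is the slowdown Pete describes. The jobs-per-hour number is included only because the two batches were different sizes (two jobs versus four), so per-job time and batch throughput are not the same comparison.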