On Fri, May 05, 2006 at 10:18:36AM -0500, Robert M. Hyatt wrote:
>
> One note. I am running on a quad 875 system, but am using Suse rather
> than FC4. It is running perfectly reliable (this is a 4 cpu, dual-core,
> 2.2ghz box, 8 processors total). I had problems with FC4 myself,
> although it runs perfectly on my normal dual xeon boxes...
>
> On Fri, 5 May 2006, Bill Davidsen wrote:
>
> >Michal Szymanski wrote:
> >
> >>Hi all,
> >>
> >>I have recently purchased three Supermicro AS1020A-T servers equipped
> >>with two dual-core Opterons 280 each. H8DAR-T motherboards, 8 or 12 GB
> >>RAM. The systems carry FC4 x86_64 with proprietary driver (made by
> >>Adaptec) for the onboard Marvell 88SX6041 SATA Controller. Original
> >>(install) kernel 2.6.11-1.1369_FC4smp - unfortunately not upgradable due
> >>to the lack of the SATA driver for other kernel versions.
> >>
> >>All systems crash (either hang with some "machine check exception"
> >>kernel messages or reset) when loaded with repeating runs of 1.3gb, CPU
> >>intensive with some I/O. I run 2 or 4 jobs simultaneously and they had
> >>never survived more than a few hours.
> >> ...
> >>2. I ran non-SMP 2.6.11 kernel (with Adaptec driver) on another machine.
> >>There have been two test repeating 1.3g jobs running on it (each getting
> >>50%
> >>of the single CPU used by the system) for over 50 hours now, no crashes.
> >>Also, a single test job running on SMP kernel gave no crashes in 24 hours.
> >>
> >What happens if you use only one CPU? Either with a uni kernel (you should
> >have gotten one) or "maxcpus=1" in the boot commands. You are running a
> >custom kernel with custom drivers, so you really should be asking the
> >supplier, all we can do is suggest things which might provide extra
> >information.

Hi all,

I got 3 copies of Robert's message but none of Bill's :-) Still, I
don't quite understand Bill's question ("What happens if you use only
one CPU?"). The answer is quoted just above this question!
There were no crashes with the system running on the non-SMP kernel.

In the meantime I got Kingston 1GB modules from my dealer, for testing.
Strange as it seems, I could not crash the machine with the Kingston
memory, even with tests running for as long as 72 hours. It seems, then,
that it is a memory issue, although I do not understand why the same
memory crashes the machine under the SMP kernel and not under the
non-SMP one, with a similar load. Also, the Patriot 2GB memory modules
(which seem to crash the machines) are on Supermicro's list of memory
recommended for the H8DAR-T mobo.

One of the machines went back to the dealer (actually to their memory
supplier) for tests. The memory guys do not seem to trust our crash
reports. We'll see what happens. I am afraid, however, that they will
say "the memory is OK".

regards, Michal.

-- 
Michal Szymanski (msz at astrouw dot edu dot pl)
Warsaw University Observatory, Warszawa, POLAND