Hi,

Lately I've been updating our SMP machine, and alongside it I built a second SMP machine. The first one, apart from a "stuck on TLB" glitch two months ago, never crashed.

Recently, some changes have been made. Both machines now run Helix GNOME 1.2 with updates and a distributed net client, along with, of course, the Red Hat 6.2 updates. Both machines have become highly unstable when running X. That could just be a manifestation of the extra load the machines receive when X runs; I doubt the Helix binaries themselves are the cause.

None of the crashes so far left any log entries. The machine would suddenly become extremely slow, and within 3-5 seconds the mouse would freeze along with the entire machine.

Today I managed to get a log entry, though ksymoops can't seem to read it (and I can't read/match the symbols for some odd reason).

Aug 31 16:28:49 dupla kernel:
Aug 31 16:28:49 dupla kernel: wait_on_bh, CPU 0:
Aug 31 16:28:49 dupla kernel: irq: 1 [0 1]
Aug 31 16:28:49 dupla kernel: bh: 1 [0 1]
Aug 31 16:29:20 dupla kernel: <[c010be9d]> <[c0169cc2]> <[c0169d3d]> <[c017990d]> <[c0151d6f]> <[c013496b]> <[c0134ac7]> stuck on TLB IPI wait (CPU#0)
Aug 31 16:29:20 dupla kernel: stuck on TLB IPI wait (CPU#0)
Aug 31 16:29:20 dupla kernel: stuck on TLB IPI wait (CPU#0)

After three of these, a fourth one happened on CPU#1, then it continued on CPU#0 again. This time I had managed to switch back to console mode just before the system froze completely, and managed to use SysRq-r to remount ro and SysRq-b to reboot the machine.

Ksymoops said:

Warning (Oops_read): Code line not seen, dumping what data is available

Trace; c010be9d <synchronize_bh+3d/50>
Trace; c0169cc2 <tcp_listen_poll+12/50>
Trace; c0169d3d <tcp_poll+3d/100>
Trace; c017990d <inet_poll+21/2c>
Trace; c0151d6f <sock_poll+1f/24>
Trace; c013496b <do_poll+7b/dc>
Trace; c0134ac7 <sys_poll+fb/17c>

819 warnings and 1 error issued. Results may not be reliable.
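Since ksymoops reports that its results may not be reliable, the trace addresses can be cross-checked by hand against the System.map of the running kernel. What follows is only a minimal sketch of that lookup, not how ksymoops itself works: the function names are made up for illustration, and it assumes the usual three-column System.map layout (hex address, type letter, symbol name). Each address is matched to the closest symbol at or below it and printed in the same name+offset/size form as above.

```python
import bisect
import re


def parse_system_map(lines):
    """Return sorted (address, name) pairs from System.map-style lines.

    Assumes each line looks like: 'c010be60 T synchronize_bh'.
    """
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def extract_addresses(log_line):
    """Pull the <[c010be9d]>-style addresses out of a kernel log line."""
    return [int(m, 16) for m in re.findall(r"<\[([0-9a-fA-F]{8})\]>", log_line)]


def resolve(symbols, address):
    """Map an address to 'name+offset/size' using the nearest lower symbol.

    Returns None if the address lies below the first symbol. The size is
    taken as the distance to the next symbol, so it is omitted for the
    last symbol in the map.
    """
    starts = [addr for addr, _ in symbols]
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    offset = address - start
    if i + 1 < len(symbols):
        size = symbols[i + 1][0] - start
        return f"{name}+{offset:x}/{size:x}"
    return f"{name}+{offset:x}"
```

Typical use would be to read /boot/System.map (or whichever map matches the kernel that actually crashed) with parse_system_map, feed the trace line from the log to extract_addresses, and call resolve on each address. If the names disagree with the ksymoops trace, the map probably does not belong to the running kernel.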

The network card is an HP 100VG AnyLAN (driver hp100.o).

If needed, I can provide access (including root) on the spare dual-CPU machine. This machine is an Asus P2L97-DS with two P-II Deschutes CPUs at 333 MHz. CPU#0 is stepping 0, CPU#1 is stepping 2.

As I said, we have two dual-CPU systems. The other one shows the same symptoms, but is an Asus P2B-DS with two identical P-III Katmai CPUs at 450 MHz, stepping 7. I've never managed to get a log entry on that one, and since it's a production machine, I'm no longer running X on it [1].

Paul Wouters
Xtended Internet

[1] I felt really awful running X on the NIS master to begin with :)

--
Broerdijk 27        Postbus 170         Tel: 31-24-360 39 19
6523 GM Nijmegen    6500 AD Nijmegen    Fax: 31-24-360 19 99
The Netherlands     The Netherlands     info@xtdnet.nl