Re: Diagnosing random hangs

On Dec 18, 2006, at 16:17, Mark Belanger wrote:

> I have many different CentOS machines that are hanging
> regularly.  I believe this is due to something our application
> is doing - not a CentOS specific problem.

I have the same problem. I even posted something to this list titled "Strange system hangs" on 11/27 but didn't get any responses.

> When the machines hang, there is no access to the console
> or remote access (ssh, rsh, etc).

I have that symptom as well. There is no way to do any debugging once a machine gets into that state, so I added the following two lines to the /etc/syslog.conf file:

  kern.*                                        @<central server>
  *.info;mail.none;authpriv.none;cron.none      @<central server>

Should I add any other levels to the selector field? BTW, my systems are running a completely stock CentOS distribution EXCEPT for the binary nVidia driver, which was the only way I could get these systems to drive the 20" LCD displays at their native 1600x1200 resolution using the correct refresh rate.

I had another report of a hang this morning, but in this case even though the machine appears frozen (the screen saver is stuck and I can't get to the alternate consoles), I can in fact log into the machine remotely and top shows me that the X server is using 100% of the CPU:

top - 08:44:22 up 10 days, 23:00, 10 users, load average: 1.04, 1.01, 1.00
Tasks: 115 total,   2 running, 113 sleeping,   0 stopped,   0 zombie
Cpu(s): 99.7% us,  0.3% sy,  0.0% ni,  0.0% id,  0.0% wa,  0.0% hi,  0.0% si
Mem:   3113468k total,  1361240k used,  1752228k free,    87312k buffers
Swap:  3047416k total,        0k used,  3047416k free,   957756k cached

    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
   4381 root      25   0 67748  42m 7776 R 99.8  1.4 782:53.37 X
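
For anyone who wants to catch this state automatically rather than by logging in and eyeballing top, here is a minimal sketch in Python that picks out the rows of a top process table at or above a CPU threshold. The function name high_cpu_rows and the 90% threshold are my own assumptions, and it only understands the column layout shown above.

```python
def high_cpu_rows(top_text, threshold=90.0):
    """Return (pid, command, cpu) for rows of a top process table
    whose %CPU is at or above the threshold."""
    rows = []
    columns = None
    for line in top_text.splitlines():
        fields = line.split()
        if "PID" in fields and "%CPU" in fields:
            columns = fields              # header row names the columns
            continue
        if columns is None or len(fields) < len(columns):
            continue                      # summary lines, blanks, short rows
        record = dict(zip(columns, fields))
        try:
            pid = int(record["PID"])
            cpu = float(record["%CPU"])
        except ValueError:
            continue                      # not a process row
        if cpu >= threshold:
            rows.append((pid, record["COMMAND"], cpu))
    return rows


# The process table from the output above, used as sample input.
sample = """\
    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
   4381 root      25   0 67748  42m 7776 R 99.8  1.4 782:53.37 X
"""

if __name__ == "__main__":
    for pid, command, cpu in high_cpu_rows(sample):
        print(pid, command, cpu)
```

The text could come from running top in batch mode (top -b) from cron and saving the output; that part is left out here.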

I also see the following in /var/log/messages:

Dec 18 19:56:02 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000001
Dec 18 19:56:03 hepdsw04 kernel: NVRM: Xid (0001:00): 9, Channel 00000020 Instance 00000000 Intr 00100000
Dec 18 19:56:09 hepdsw04 Synergy 1.3.1: NOTE: CServerProxy.cpp, 315: server is dead
Dec 18 19:56:10 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000020
Dec 18 19:56:11 hepdsw04 kernel: NVRM: Xid (0001:00): 9, Channel 00000020 Instance 00000000 Intr 00100000
Dec 18 19:56:18 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000020
Dec 18 19:56:19 hepdsw04 kernel: NVRM: Xid (0001:00): 9, Channel 00000020 Instance 00000000 Intr 00100000
Dec 18 19:56:26 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000020
Dec 18 19:56:27 hepdsw04 kernel: NVRM: Xid (0001:00): 9, Channel 00000020 Instance 00000000 Intr 00100000
Dec 18 19:56:34 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000001
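
To see how often each of these entries occurs, a rough Python sketch like the following could tally them by Xid number and channel. The regular expression is written against the lines exactly as they appear above and is only an assumption about the format in general; the name count_xids is made up for illustration.

```python
import re
from collections import Counter

# Matches lines such as:
#   kernel: NVRM: Xid (0001:00): 8, Channel 00000001
# Assumed from the log excerpt above, not from any driver documentation.
XID_RE = re.compile(
    r"NVRM: Xid \(([0-9A-Fa-f:]+)\): (\d+), Channel ([0-9A-Fa-f]+)"
)


def count_xids(lines):
    """Return a Counter keyed by (xid_number, channel)."""
    counts = Counter()
    for line in lines:
        match = XID_RE.search(line)
        if match:
            counts[(int(match.group(2)), match.group(3))] += 1
    return counts


# A few lines copied from the log above, including one non-NVRM entry.
sample = [
    "Dec 18 19:56:02 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000001",
    "Dec 18 19:56:03 hepdsw04 kernel: NVRM: Xid (0001:00): 9, Channel 00000020 "
    "Instance 00000000 Intr 00100000",
    "Dec 18 19:56:09 hepdsw04 Synergy 1.3.1: NOTE: CServerProxy.cpp, 315: "
    "server is dead",
    "Dec 18 19:56:10 hepdsw04 kernel: NVRM: Xid (0001:00): 8, Channel 00000020",
]

if __name__ == "__main__":
    for (xid, channel), n in sorted(count_xids(sample).items()):
        print(f"Xid {xid} on channel {channel}: {n}")
```

Pointing it at /var/log/messages would just mean passing the open file object instead of the sample list.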

What is the meaning of the NVRM entries? The Synergy entry is from the keyboard/mouse sharing Synergy utility (great program BTW, I couldn't live without it).

Anyway, sorry to inject my own problems into this thread, but maybe these hangs are all related.

Alfred
