[Linux-cluster] Node hang

Hello guys,

After 3 days of a heavy read/write test load, one of our nodes crashed with the following error:

Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: fatal: invalid metadata block
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: bh = 13156295 (magic)
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: function = gfs_get_data_buffer
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: file = /usr/src/cluster/gfs-kernel/src/gfs/dio.c, line = 1328
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: time = 1108659988
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: about to withdraw from the cluster
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: waiting for outstanding I/O
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: telling LM to withdraw
Feb 17 12:06:35 atmail-1 kernel: lock_dlm: withdraw abandoned memory
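
The withdraw messages above carry a few structured fields (bh, function, file, line, time). The following is a minimal Python sketch for pulling them apart, assuming the syslog format shown above. The helper names parse_gfs_withdraw and epoch_to_utc are hypothetical and not part of GFS. The time field looks like a Unix epoch timestamp, and bh is presumably the block number of the buffer that failed the metadata check; both are assumptions. None of the lines shown contains a file name.

```python
import re
from datetime import datetime, timezone

# Lines copied from the report above (syslog prefix included).
LOG = """\
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: fatal: invalid metadata block
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: bh = 13156295 (magic)
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: function = gfs_get_data_buffer
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: file = /usr/src/cluster/gfs-kernel/src/gfs/dio.c, line = 1328
Feb 17 12:06:28 atmail-1 kernel: GFS: fsid=ISQCLUSTER:gfs001.0: time = 1108659988
"""

# Matches "GFS: fsid=<id>: <key> = <value>" lines.
# Plain messages such as "fatal: ..." have no " = " and are skipped.
FIELD_RE = re.compile(r"GFS: fsid=(?P<fsid>\S+): (?P<key>\w+) = (?P<value>.+)$")


def parse_gfs_withdraw(text):
    """Collect the key = value fields of a GFS withdraw report into a dict."""
    fields = {}
    for line in text.splitlines():
        match = FIELD_RE.search(line)
        if match is None:
            continue
        fields["fsid"] = match.group("fsid")
        key = match.group("key")
        value = match.group("value").strip()
        if key == "file" and ", line = " in value:
            # "file = <path>, line = <n>" carries two fields on one line.
            value, fields["line"] = value.split(", line = ", 1)
        fields[key] = value
    return fields


def epoch_to_utc(seconds):
    """Render an epoch value (assumed to be in seconds) as an ISO 8601 UTC string."""
    return datetime.fromtimestamp(int(seconds), tz=timezone.utc).isoformat()


if __name__ == "__main__":
    report = parse_gfs_withdraw(LOG)
    for key in ("fsid", "bh", "function", "file", "line", "time"):
        print(f"{key:>8}: {report.get(key)}")
    print("time (UTC):", epoch_to_utc(report["time"]))
```

Read as epoch seconds, the time field converts to 2005-02-17 17:06:28 UTC, which matches the 12:06:28 syslog stamp if the node's clock runs five hours behind UTC.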


We mount our GFS partition with the noatime option, and quotas have been disabled to improve performance. The applications currently running are Postfix, Apache, and Courier-IMAP.

We are using the CVS version available on Feb 14 around 5:00 PM.

Can anyone shed some light on this?
Is there any way to tell from the log exactly which file the server was trying to read or write when it crashed?


Regards
Bujan


