Re: [Linux-cluster] cluster failed after 53 hours

On Mon, Jan 17, 2005 at 05:31:33PM -0800, Daniel McNeil wrote:
> My 3 node cluster ran tests for 53 hours before hitting a problem.
> 
> 
> Node cl031 hit the 1st problem CMAN: killed by STARTTRANS or
> NOMINATE.  There is a DLM assert on cl031 also, but that is
> after a whole bunch of debug output.  The full logs are
> here (http://developer.osdl.org/daniel/GFS/test.12jan2005/)
> 
> Any ideas on what is going on?
> 
> Here is simplified output (in the README file):
> test started Jan Wed 12 17:18
> hung after Fri Jan 14 22:00
> 
> cl031 got an error in just under 53 hours.
> ==========================================
> Jan 14 22:00:38 cl031 kernel: CMAN: node cl031a has been removed from the cluster : No response to messages

It's the usual thing: missing messages.

patrick

