On Mon, Apr 11, 2005 at 05:13:06PM -0700, Daniel McNeil wrote:
> I started my mount/tar/rm/ tests on Apr 4 17:41 and I hit
> a problem at Apr 6 05:30. So the test ran for 36 hours.
> cl030 and cl031 were getting "SM: process_reply invalid"
> messages and cl032 got "No response" and "Missed too many
> heartbeats"

The SM messages are an effect of CMAN removing nodes. There's a fair
chance that this recent fix will help:
http://sources.redhat.com/ml/cluster-cvs/2005-q2/msg00018.html

--
Dave Teigland <teigland@xxxxxxxxxx>

--
Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster