On Thursday 09 October 2008 20:38:43 Janar Kartau wrote:
> Like I said, I couldn't find anything in the logs besides eviction
> messages after I manually reset the server. Yes, we do use PHP and
> sessions which use memcached as a backend.

I don't know much about memcached as a backend, but I recall we finally
patched PHP so that it uses flocks (as far as I remember; or at least you
can configure how you want session file locking to work), and since then
Apache has been pretty stable. No more processes stuck in *D* state
because of this.
I don't know the current status of the PHP patch, but I think it is still
around somewhere. I need to check back on this.

-marc.

>
> Janar
>
> Marc Grimme wrote:
> > On Thursday 09 October 2008 01:24:51 Janar Kartau wrote:
> >> Hi,
> >> Recently our three-node webserver cluster started randomly crashing. I
> >> never had time to investigate what the problem was, because I needed to
> >> bring them back online again. But it seemed like all Apache processes
> >> just hang (couldn't even kill them), waiting for something. The only
> >> thing that helped was a reboot of all or a couple of the nodes. Anyway,
> >> today I encountered this problem at night and I could look into it a
> >> little more. I noticed that some of the GFS filesystems were
> >> inaccessible (we have 5 of them, mounted on every node) and one of the
> >> nodes was completely inaccessible. So I guessed that this half-dead node
> >> was holding locks on the filesystems or something. Did a hard reset on
> >> this dead node and all stabilized.
> >> Absolutely no cluster/GFS errors in the logs (besides the ones which
> >> tell that the half-dead node was leaving the cluster when I reset it).
> >> Nodes have CentOS 4.6 installed (2.6.9-67.0.7.ELsmp, dlm-1.0.7-1,
> >> GFS-6.1.15-1, cman-1.0.17-0.el4_6.5). We use EMC CX3-10c for GFS storage
> >> (over iSCSI) and EMC PowerPath for multipathing. A separate VLAN is used
> >> for CMAN/DLM traffic.
> >> Please give me ideas on how to solve this, or at least some debugging
> >> tips, as it's happening twice a day now and it seems I simply can't
> >> help it. :(
> >
> > Could you provide more information like relevant syslogs and console
> > messages?
> >
> > Are you using PHP with sessions?

-- 
Gruss / Regards,

Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/ http://www.open-sharedroot.org/

ATIX Informationstechnologie und Consulting AG
Einsteinstr. 10
85716 Unterschleissheim
Deutschland/Germany

Phone: +49-89 452 3538-0
Fax: +49-89 990 1766-0

Registergericht: Amtsgericht Muenchen
Registernummer: HRB 168930
USt.-Id.: DE209485962

Vorstand: Marc Grimme, Mark Hlawatschek, Thomas Merz (Vors.)
Vorsitzender des Aufsichtsrats: Dr. Martin Buss