On Mon, Dec 06, 2004 at 06:52:02PM -0800, Duncan Morgan wrote:
> Hello,
>
> We have apache running on 14 GFS nodes where the web roots are shared
> via GFS. Occasionally we see that the load on all nodes rises
> dramatically (to 150+) and all httpd processes become dead (D). I know
> this is a little lacking in details but does anyone have any insight
> into this? We suspected perhaps a cron job was running simultaneously
> against the GFS file system on all nodes but have virtually ruled this
> out.

I am getting the same behaviour on a non-clustered, non-GFS system
(dual Opteron on FC2/x86_64). There is a peak with almost all httpd
processes in D state (that's not really dead, but "uninterruptible
sleep", e.g. when the kernel is doing I/O on behalf of the userland
process). A few seconds later the number of D processes falls to less
than a dozen and you can watch your load decay exponentially.

Note that 150 is the default MaxClients setting for apache. Since
Linux counts processes in D state towards the load average, that's
why you get slightly more than 150 load.

I guess that's a kernel issue with too many processes accessing the
same files. Nothing to do directly with GFS.

> Please help - this is very alarming.
>
> Thanks in advance,
> Duncan Morgan
>

-- 
Axel.Thimm at ATrpms.net
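
P.S. To watch the D-state peak described above, one could count processes per state straight from /proc. This is only a rough sketch, assuming a Linux /proc layout where the state letter follows the parenthesised command name in /proc/<pid>/stat. The function names `parse_state` and `count_states` are made up for illustration and are not part of any existing tool.

```python
import os
from collections import Counter


def parse_state(stat_text):
    """Return the one-letter state field from the text of /proc/<pid>/stat."""
    # The command name is wrapped in parentheses and may itself contain
    # spaces or ')', so take the first field after the LAST ')'.
    return stat_text[stat_text.rfind(")") + 1:].split()[0]


def count_states(proc_root="/proc", name=None):
    """Count processes per state, optionally only those whose comm == name."""
    counts = Counter()
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a PID directory
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                text = f.read()
        except OSError:
            continue  # process exited while we were scanning
        comm = text[text.find("(") + 1:text.rfind(")")]
        if name is None or comm == name:
            counts[parse_state(text)] += 1
    return counts
```

Calling `count_states(name="httpd")` once a second during a load spike should show the "D" count jump towards the MaxClients limit and then drop back, if the behaviour is the same as what I see here.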