Hi,
I am running RHEL3 ES with the Red Hat Cluster Suite (no GFS, just a
failover cluster).
The clustered application does a lot of printing (LPRng), faxing
(HylaFAX) and mailing (sendmail). It uses shell scripts to pass the
jobs to the operating system's daemons.
The client programs of these daemons, which submit jobs over network
connections to localhost, start to behave erratically once the cluster
has been up for about two weeks.
Examples:
- HylaFAX faxstat stops listing the transmitted faxes in the middle of
the list (but always at the same job).
- sendmail opens a connection to the local daemon but does not transfer
the message. Both processes sit there and wait; after some time the
server closes the connection because no input arrives from the client side.
- lpr shows the same behaviour as sendmail.
I assume that something locks up in the IP stack. Not all services are
affected at the same time.
I suspect this is related to the cluster software, because we run the same
application on a lot of non-clustered servers and none of them shows this
behaviour.
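To record which service stops responding and when, a small probe along the following lines could be run periodically on the cluster node. This is only a sketch under assumptions: the ports (25 for sendmail, 515 for lpd, 4559 for the HylaFAX server) are the usual defaults and may differ on this setup, and `probe` and `SERVICES` are made-up names for illustration, not part of any of these packages.

```python
import socket

# Assumed default ports; the flag says whether the server is expected
# to send a greeting before the client sends anything.
SERVICES = {
    "sendmail": (25, True),    # SMTP servers greet first
    "lpd": (515, False),       # lpd waits for the client to speak
    "hylafax": (4559, True),   # HylaFAX server greets first
}


def probe(host, port, expect_banner, timeout=5.0):
    """Return 'ok', 'connect-timeout', 'connect-failed' or 'no-banner'."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except socket.timeout:
        # The TCP handshake itself did not complete in time.
        return "connect-timeout"
    except OSError:
        # Refused, unreachable, or any other connection error.
        return "connect-failed"
    try:
        if not expect_banner:
            return "ok"
        try:
            data = sock.recv(256)
        except OSError:
            # Includes a read timeout: connected, but the server stayed silent.
            return "no-banner"
        return "ok" if data else "no-banner"
    finally:
        sock.close()


def probe_all(host="127.0.0.1"):
    """Probe every entry in SERVICES and return a name -> status mapping."""
    return {
        name: probe(host, port, banner)
        for name, (port, banner) in SERVICES.items()
    }
```

Calling `probe_all()` from cron every few minutes and logging the result with a timestamp would show whether the hang sits in the TCP connect or in the exchange after the connection is established, and whether the services fail together or one after another.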
Any hints?
regards, Gunther
Gunther Schlegel
Manager IT Infrastructure
Riege Software International GmbH, IT Infrastructure
Mollsfels 10, 40670 Meerbusch, Germany
Tel: +49-2159-91480, Fax: +49-2159-9148-11
Email: schlegel@xxxxxxxxx
http://riege.com
--
Linux-cluster@xxxxxxxxxx
http://www.redhat.com/mailman/listinfo/linux-cluster