Re: clurgmgrd stops service without reason

Hi,

Have you been able to resolve this issue? I have the exact same symptoms
on a Red Hat cluster (rgmanager version 1.9.46).

I receive a message "<notice> stopping service fileserver", and the node
shuts down and ends up rebooting because it can't unmount a partition.

What worries me is that this has happened 3 times in 2 weeks with no
obvious reason, as the server is working fine up until that point.

The relevant section of my cluster.conf is:

<service autostart="0" domain="main" name="fileservices">
    <fs device="/dev/mapper/livevg-data" force_fsck="1"
        force_unmount="1" fsid="11439" fstype="ext3"
        mountpoint="/mnt/live" name="live" options="noatime"
        self_fence="1"/>
    <fs device="/dev/mapper/backupvg-data" force_fsck="1"
        force_unmount="1" fsid="53676" fstype="ext3"
        mountpoint="/mnt/backup" name="backup" options="noatime"
        self_fence="1"/>
    <ip address="192.168.11.253" monitor_link="1"/>
    <ip address="192.168.1.253" monitor_link="1"/>
    <script file="/etc/init.d/smb-rhcs" name="Samba"/>
    <script file="/etc/init.d/nfs-rhcs" name="NFS"/>
</service>

Any thoughts or updates would be greatly appreciated, as this is occurring
on a production server.

Regards

Mark Reynolds




> > On Wed, 2006-08-02 at 13:03 +0200, Falk Hackenberger - MediaTransfer AG
> > Netresearch & Consulting wrote:
> >
> >>--snip--
> >>Aug  1 17:31:28 kain clurgmgrd: [4780]:  Executing
> >>/exports/imap/checkimapstartup.sh status
> >>Aug  1 17:31:28 kain clurgmgrd: [4780]:  Executing
> >>/exports/subversion/etc/rc.d/init.d/svnserver status
> >>Aug  1 17:31:28 kain clurgmgrd: [4780]:  Checking 192.168.0.223,
> >>Level 0
> >>Aug  1 17:31:28 kain clurgmgrd: [4780]:  192.168.0.223 present on
> >>eth0
> >>Aug  1 17:31:28 kain clurgmgrd: [4780]:  Link for eth0: Detected
> >>Aug  1 17:31:28 kain clurgmgrd: [4780]:  Link detected on eth0
> >>Aug  1 17:31:37 kain clurgmgrd[4780]:  Stopping service storage
> >>--snap--
> >>
> >>How can I tell clurgmgrd to log the reason for stopping the
> >>service?
> >
> > Something must be returning an error code where it should not be; can
> > you post your service XML blob?
>
>it is very long and a little bit complex as i know... ;-)
>
>recovery="restart">


--

Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
