For reasons we cannot explain, our CS service (WATSON) just went down on its own. The syslog entries for the event are:
May 7 13:18:39 db1 clurgmgrd[17888]: <err> #48: Unable to obtain cluster lock: Connection timed out
May 7 13:18:41 db1 kernel: dlm: Magma: reply from 2 no lock
May 7 13:18:41 db1 kernel: dlm: reply
May 7 13:18:41 db1 kernel: rh_cmd 5
May 7 13:18:41 db1 kernel: rh_lkid 200242
May 7 13:18:41 db1 kernel: lockstate 2
May 7 13:18:41 db1 kernel: nodeid 0
May 7 13:18:41 db1 kernel: status 0
May 7 13:18:41 db1 kernel: lkid ee0388
May 7 13:18:41 db1 clurgmgrd[17888]: <notice> Stopping service WATSON
... and its service entry looks like this:
<service autostart="0" domain="DB" exclusive="1" name="WATSON" recovery="disable">
  <ip address="192.168.3.111" monitor_link="1"/>
  <fs device="/dev/VGWATSON/lvoldata" force_fsck="0" force_unmount="1" fsid="53188" fstype="ext3" mountpoint="/watson-data" name="WATSON-lvoldata" options="" self_fence="0">
    <fs device="/dev/VGWATSON/lvoldb1" force_fsck="0" force_unmount="1" fsid="29524" fstype="ext3" mountpoint="/watson-data/sys/db1" name="WATSON-lvoldb1" options="" self_fence="0"/>
    <script file="/etc/init.d/WATSON" name="WATSON RC"/>
  </fs>
  <clusterfs ref="WATSON-lvol0">
    <clusterfs ref="WATSON-lvol1"/>
  </clusterfs>
</service>
Robert Hurst, Sr. Caché Administrator
Beth Israel Deaconess Medical Center
1135 Tremont Street, REN-7
Boston, Massachusetts 02120-2140
617-754-8754 ∙ Fax: 617-754-8730 ∙ Cell: 401-787-3154
Any technology distinguishable from magic is insufficiently advanced.
-- Linux-cluster mailing list Linux-cluster@xxxxxxxxxx https://www.redhat.com/mailman/listinfo/linux-cluster