Re: Deadlock when using clvmd + OpenAIS + Corosync

On 08/01/10 22:58, Evan Broder wrote:
[please preserve the CC when replying, thanks]

Hi -
    We're attempting to set up a clvm (2.02.56) cluster using OpenAIS
(1.1.1) and Corosync (1.1.2). We've gotten bitten hard in the past by
crashes leaving DLM state around and forcing us to reboot our nodes,
so we're specifically looking for a solution that doesn't involve
in-kernel locking.

We're also running the Pacemaker OpenAIS service, as we're hoping to
use it for management of some other resources going forward.

We've managed to form the OpenAIS cluster, and get clvmd running on
both of our nodes. Operations using LVM succeed, so long as only one
operation runs at a time. However, if we attempt to run two operations
(say, one lvcreate on each host) at a time, they both hang, and both
clvmd processes appear to deadlock.
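
For anyone trying to reproduce this, the scenario above can be sketched as a small harness that starts several commands at once and reports which ones fail to finish within a time limit. This is only an illustration: `run_concurrently` is a made-up helper name, and the host names, volume group and LV names in the usage note are hypothetical.

```python
import subprocess
import time


def run_concurrently(commands, timeout):
    """Start every command at once and wait up to `timeout` seconds overall.

    Returns a list of (command, returncode) pairs; returncode is None for a
    command that had not finished by the deadline (a possible hang).
    """
    # Launch everything first so the operations genuinely overlap.
    procs = [subprocess.Popen(cmd) for cmd in commands]
    deadline = time.monotonic() + timeout
    results = []
    for cmd, proc in zip(commands, procs):
        remaining = max(0.0, deadline - time.monotonic())
        try:
            results.append((cmd, proc.wait(timeout=remaining)))
        except subprocess.TimeoutExpired:
            # Only the local child is killed here; a remote LVM command
            # started through ssh may still be stuck on its node.
            proc.kill()
            proc.wait()
            results.append((cmd, None))
    return results
```

A hypothetical invocation would be `run_concurrently([["ssh", "node1", "lvcreate", "-L", "1G", "-n", "test1", "vg0"], ["ssh", "node2", "lvcreate", "-L", "1G", "-n", "test2", "vg0"]], timeout=60)`. Any entry that comes back with `None` did not complete in time.
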

When they deadlock, it doesn't appear to affect the other clustering
processes - both corosync and pacemaker still report a fully formed
cluster, so it seems the issue is localized to clvmd.

I've looked at logs from corosync and pacemaker, and I've straced
various processes, but I don't want to blast a bunch of useless
information at the list. What information can I provide to make it
easier to debug and fix this deadlock?


To start with, the most useful logging is clvmd's own debug output, which you can get by running clvmd with the -d option (see the man page for details). Ideally, collect these logs from all nodes in the cluster so they can be correlated. If you're still using the DLM, a DLM lock dump from all nodes is often helpful alongside the clvmd logs.
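
Correlating the per-node logs by hand gets tedious, so here is a rough sketch of one way to interleave them by time. It assumes each log line carries a syslog-style timestamp such as "Jan  8 22:58:01" and that the node clocks are reasonably in sync. Neither is guaranteed, so check your actual clvmd output and adjust the pattern. The function names and the `year` default are just illustrative choices (these timestamps carry no year).

```python
import re
from datetime import datetime

# Assumption: lines contain a timestamp like "Jan  8 22:58:01".
# Adjust this pattern if your clvmd debug output looks different.
STAMP = re.compile(r"([A-Z][a-z]{2})\s+(\d{1,2}) (\d{2}:\d{2}:\d{2})")


def parse_stamp(line, year=2010):
    """Return the timestamp found in `line`, or None if there is none."""
    m = STAMP.search(line)
    if m is None:
        return None
    month, day, clock = m.groups()
    text = f"{year} {month} {int(day):02d} {clock}"
    return datetime.strptime(text, "%Y %b %d %H:%M:%S")


def merge_logs(logs, year=2010):
    """Interleave log lines from several nodes in timestamp order.

    `logs` maps a node name to its list of log lines. Lines without a
    timestamp inherit the previous line's time so they stay attached to it.
    """
    tagged = []
    for node, lines in logs.items():
        last = datetime.min
        for seq, line in enumerate(lines):
            stamp = parse_stamp(line, year) or last
            last = stamp
            tagged.append((stamp, node, seq, line.rstrip("\n")))
    # Sort by time, then node, then original position within the node's log.
    tagged.sort(key=lambda t: (t[0], t[1], t[2]))
    return [f"{node}: {line}" for _, node, _, line in tagged]
```

Feed it something like `merge_logs({"node1": open("node1.log").readlines(), "node2": open("node2.log").readlines()})` (file names hypothetical) and read the combined output around the point where both operations stall.
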

Also, did you know it's possible to use clvmd without the DLM? The -I openais option tells it to use the OpenAIS Lck service in userspace, though if there are DLM bugs I think we'd like to fix them if possible ;-)


Chrissie

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster