Your quorum config:
nodes = 2 votes each * 3 nodes (6 node votes total)
qdisk = 3 votes
expected = 9, so the quorum threshold is 9/2 + 1 = 5 (integer math). A single node can maintain quorum on its own, since 2 (node) + 3 (qdisk) = 5 >= 5.
In a split-brain condition where a single node cannot talk to the other nodes, this could be disastrous.
Now, all that said: qdiskd, using a clustered LVM volume as yours appears to be, won't be able to start until the cluster is quorate.
Also, you might be running into a chicken-and-egg situation. Is your qdisk volume marked as clustered? I believe once you set the locking type to 3, all LVM activity requires clvmd to be running. If it's not marked as clustered, I don't think that will work either, since qdisk requires concurrent access across nodes. And if it is clustered, you have to wait for clvmd.
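A quick sanity check (just a sketch; "VG_QDISK" is only a placeholder for whatever VG actually holds your qdisk LV):

# grep locking_type /etc/lvm/lvm.conf
# vgs -o vg_name,vg_attr
# vgchange -cn VG_QDISK

locking_type = 3 means every activation goes through clvmd; in the vgs output a 'c' in the sixth attribute character marks a clustered VG, and vgchange -cn clears that flag (you may have to override the locking type temporarily if clvmd isn't running when you do it).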
It's unclear why you actually need a qdisk. If it's to keep the cluster up in single-node mode, then I'd have the qdisk start out with a minority vote (votes=1) and only raise it in a controlled situation where you are sure the other nodes are shut down completely. Remember, the purpose of quorum is to ensure that a majority rules, and your config violates that premise. Just sayin' :)
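For illustration only, reusing the quorumd attributes you posted below but with a single vote, that would just be:

<quorumd device="/dev/mapper/mpathquorum" interval="5"
         label="clrhevquorum" tko="24" votes="1"/>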
Does your cluster run without qdiskd configured?
Anyway, I hope this helps at least a little. If I am way off base, I apologize and will crawl back into my cave :)
Good luck
Corey
On Thu, Aug 2, 2012 at 4:50 AM, emmanuel segura <emi2fast@xxxxxxxxx> wrote:
if you think the problem is in lvm, put it in debug mode (see man lvm.conf)

2012/8/2 Gianluca Cecchi <gianluca.cecchi@xxxxxxxxx>:
On Wed, Aug 1, 2012 at 6:15 PM, Gianluca Cecchi wrote:
> On Wed, 1 Aug 2012 16:26:38 +0200 emmanuel segura wrote:
>> Why don't you remove expected_votes=3 and let the cluster calculate that automatically?
>
> Thanks for your answer Emmanuel, but cman starts correctly, while the
> problem seems related to
> vgchange -aly
> command hanging.
> But I tried that option too and the cluster hangs at the same point as before.
Further testing shows that the cluster is indeed quorate and the problem is
related to lvm...
I also tried following a more commonly used and cleaner configuration seen in
examples for 3 nodes + quorum daemon:
2 votes for each node
<clusternode name="nodeX" nodeid="X" votes="2">
3 votes for quorum disk
<quorumd device="/dev/mapper/mpathquorum" interval="5"
label="clrhevquorum" tko="24" votes="3">
with and without expected_votes="9" in <cman ... /> part
A single node plus the quorum disk alone should be ok (2+3 = 5 votes).
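For reference, the skeleton of the cluster.conf I'm testing is roughly this (fencing and rm sections omitted; the names of nodes 1 and 2 are placeholders here, only node 3 appears in the output below):

<cluster name="clrhev" config_version="51">
  <cman expected_votes="9"/>  <!-- also tried without expected_votes -->
  <clusternodes>
    <clusternode name="intrarhev1" nodeid="1" votes="2"/>
    <clusternode name="intrarhev2" nodeid="2" votes="2"/>
    <clusternode name="intrarhev3" nodeid="3" votes="2"/>
  </clusternodes>
  <quorumd device="/dev/mapper/mpathquorum" interval="5"
           label="clrhevquorum" tko="24" votes="3"/>
</cluster>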
After cman starts and quorumd is not master yet:
# cman_tool status
Version: 6.2.0
Config Version: 51
Cluster Name: clrhev
Cluster Id: 43203
Cluster Member: Yes
Cluster Generation: 1428
Membership state: Cluster-Member
Nodes: 1
Expected votes: 9
Total votes: 2
Node votes: 2
Quorum: 5 Activity blocked
Active subsystems: 4
Flags:
Ports Bound: 0 178
Node name: intrarhev3
Node ID: 3
Multicast addresses: 239.192.168.108
Node addresses: 192.168.16.30
Then, once qdiskd registers its votes:
# cman_tool status
Version: 6.2.0
Config Version: 51
Cluster Name: clrhev
Cluster Id: 43203
Cluster Member: Yes
Cluster Generation: 1428
Membership state: Cluster-Member
Nodes: 1
Expected votes: 9
Quorum device votes: 3
Total votes: 5
Node votes: 2
Quorum: 5
Active subsystems: 4
Flags:
Ports Bound: 0 178
Node name: intrarhev3
Node ID: 3
Multicast addresses: 239.192.168.108
Node addresses: 192.168.16.30
And startup continues up to the clvmd step.
In this phase, while clvmd startup hangs forever, I have:
# dlm_tool ls
dlm lockspaces
name clvmd
id 0x4104eefa
flags 0x00000000
change member 1 joined 1 remove 0 failed 0 seq 1,1
members 3
# ps -ef|grep lv
root 3573 2593 0 01:05 ? 00:00:00 /bin/bash
/etc/rc3.d/S24clvmd start
root 3578 1 0 01:05 ? 00:00:00 clvmd -T30
root 3620 1 0 01:05 ? 00:00:00 /sbin/lvm pvscan
--cache --major 253 --minor 13
root 3804 3322 0 01:09 pts/0 00:00:00 grep lv
# ps -ef|grep vg
root 3601 3573 0 01:05 ? 00:00:00 /sbin/vgchange -ayl
root 3808 3322 0 01:09 pts/0 00:00:00 grep vg
# ps -ef|grep lv
root 3573 2593 0 01:05 ? 00:00:00 /bin/bash
/etc/rc3.d/S24clvmd start
root 3578 1 0 01:05 ? 00:00:00 clvmd -T30
root 4008 3322 0 01:13 pts/0 00:00:00 grep lv
# ps -ef|grep 3578
root 3578 1 0 01:05 ? 00:00:00 clvmd -T30
root 4017 3322 0 01:13 pts/0 00:00:00 grep 3578
It remains stuck at:
# service clvmd start
Starting clvmd:
Activating VG(s): 3 logical volume(s) in volume group "VG_VIRT02" now active
Is there any way to debug clvmd?
I suppose it communicates over the intracluster network, correct?
Would a tcpdump capture be of any help?
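Just a sketch of what I plan to try next (clvmd's -d switch and the lvm.conf log section are documented in clvmd(8) and lvm.conf(5); the log file path is only an example):

# service clvmd stop
# clvmd -d1 -T30

should keep clvmd in the foreground with debug output on stderr (if I read clvmd(8) right), and

# dlm_tool lockdebug clvmd

should show whether the vgchange is stuck waiting on a DLM lock. For the lvm commands themselves I can raise logging in the log section of lvm.conf:

log {
    verbose = 3
    file = "/var/log/lvm2-debug.log"
    level = 7
    activation = 1
}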
Has anyone already moved to 6.3 (on RHEL and/or CentOS) with everything
working ok with clvmd?
BTW: I also tried lvmetad, which is a tech preview in 6.3, enabling its
service and putting "use_lvmetad = 1" in lvm.conf, but without luck...
Thanks in advance
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster