I still can't get qdisk working: 5-10 seconds after qdiskd starts,
it blocks waiting for some communication.
Some more debugging of this issue.
The strace of qdiskd at the point where it blocks:
lseek(6, 65536, SEEK_SET) = 65536
read(6, "\36\273\336\0`\224\213P\2265\v\3P\0\0\0\0\0\0\0\0\0\0\0"..., 512) = 512
select(4, [3], NULL, NULL, {0, 0}) = 0 (Timeout)
writev(3, [{"NAMC\3\0\0\20\30\0\0\0\267\0\0\200\0\0\0\0", 20}, {"\1\0\0\0", 4}], 2) = 24
recvfrom(3,
And the gdb stack trace in the blocked state:
#0 0x00007f2e642adc75 in recv () from /lib64/libpthread.so.0
#1 0x00007f2e643bba81 in cman_dispatch (handle=0x50e010, flags=26) at /usr/src/packages/BUILD/cluster-2.03.09/cman/lib/libcman.c:501
#2 0x00007f2e643bbc7b in info_call (h=0x50e010, msgtype=<value optimized out>, inbuf=<value optimized out>, inlen=<value optimized out>, outbuf=0x0, outlen=0) at /usr/src/packages/BUILD/cluster-2.03.09/cman/lib/libcman.c:59
#3 0x00007f2e643bc07a in cman_poll_quorum_device (handle=0x6, isavailable=1) at /usr/src/packages/BUILD/cluster-2.03.09/cman/lib/libcman.c:1016
#4 0x0000000000406d6f in quorum_loop (ctx=0x7fff6c5d5a10, ni=0x7fff6c5d5110, max=16) at /usr/src/packages/BUILD/cluster-2.03.09/cman/qdisk/main.c:985
#5 0x0000000000407b50 in main (argc=<value optimized out>, argv=0x7fff6c5d6508) at /usr/src/packages/BUILD/cluster-2.03.09/cman/qdisk/main.c:1540
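To see what qdiskd actually sent before blocking in recv, the 20-byte header and 4-byte payload from the writev call in the strace can be unpacked. This is only a rough sketch: it assumes the header is five little-endian 32-bit fields (magic, version, length, command, flags), which is my reading of the bytes and not something checked against the libcman headers, and the field names are my own labels.

```python
import struct

# Bytes copied verbatim from the writev() call in the strace
# (strace prints non-printable bytes as octal escapes).
header = b"NAMC\3\0\0\20\30\0\0\0\267\0\0\200\0\0\0\0"
payload = b"\1\0\0\0"

# ASSUMPTION: the header is five little-endian uint32 fields.
# The names below are illustrative labels, not taken from the source.
magic, version, length, command, flags = struct.unpack("<5I", header)

# ASSUMPTION: the payload is a single little-endian 32-bit integer.
(isavailable,) = struct.unpack("<i", payload)

# Read as a big-endian ASCII tag, the magic value spells a name.
magic_tag = magic.to_bytes(4, "big").decode("ascii")

print(f"magic       = {magic:#010x} ({magic_tag})")
print(f"version     = {version:#010x}")
print(f"length      = {length} (header {len(header)} + payload {len(payload)})")
print(f"command     = {command:#010x}")
print(f"flags       = {flags:#x}")
print(f"isavailable = {isavailable}")
```

Under these assumptions the length field comes out as 24, matching the writev return value, and the payload decodes to 1, matching isavailable=1 in frame #3. That would suggest the request itself is well-formed and qdiskd is simply never getting a reply on the cman socket.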
What can be wrong?
Thanks, Stepan
Stepan Kadlec wrote:
Hi,
I am running cluster 2.03.08.
After adding the qdisk feature to a two-node cluster, it somehow locks
the entire cluster. Without qdisk it runs OK.
Initialization log:
Nov 21 15:15:07 xen01 ccsd[15178]: Starting ccsd 2.03.08:
Nov 21 15:15:07 xen01 ccsd[15178]: Built: Nov 18 2008 14:18:19
Nov 21 15:15:07 xen01 ccsd[15178]: Copyright (C) Red Hat, Inc. 2004-2008 All rights reserved.
Nov 21 15:15:07 xen01 ccsd[15178]: IP Protocol:: IPv4 only
Nov 21 15:15:07 xen01 ccsd[15178]: /etc/cluster/cluster.conf (cluster name = xen, version = 1) found.
Nov 21 15:15:10 xen01 ccsd[15178]: Initial status:: Inquorate
Nov 21 15:15:22 xen01 qdiskd[15202]: <debug> 0 heuristics loaded
Nov 21 15:15:22 xen01 qdiskd[15202]: <debug> Quorum Daemon: 0 heuristics, 1 interval, 10 tko, 1 votes
Nov 21 15:15:22 xen01 qdiskd[15202]: <debug> Run Flags: 00000031
Nov 21 15:15:22 xen01 qdiskd[15202]: <info> Quorum Partition: /dev/disk/by-id/scsi-360a9800068706952464a4b544c704271-part2 Label: xen
Nov 21 15:15:22 xen01 qdiskd[15203]: <info> Quorum Daemon Initializing
Nov 21 15:15:22 xen01 qdiskd[15203]: <debug> I/O Size: 512 Page Size: 4096
Nov 21 15:15:22 xen01 qdiskd[15203]: <debug> Permanently setting score to 1/1
Nov 21 15:15:22 xen01 kernel: dlm: closing connection to node 2
Nov 21 15:15:22 xen01 kernel: dlm: closing connection to node 1
Nov 21 15:15:25 xen01 qdiskd[15203]: <debug> Node 2 is UP
Nov 21 15:15:32 xen01 qdiskd[15203]: <info> Initial score 1/1
Nov 21 15:15:32 xen01 qdiskd[15203]: <info> Initialization complete
Nov 21 15:15:32 xen01 qdiskd[15203]: <notice> Score sufficient for master operation (1/1; required=1); upgrading
Nov 21 15:15:34 xen01 qdiskd[15203]: <debug> Making bid for master
Nov 21 15:15:38 xen01 qdiskd[15203]: <info> Assuming master role
After this, all cluster tools (cman_tool nodes, clustat, ...) just hang,
and the cluster processes are in a locked state:
13124 ? Ssl 0:00 /sbin/ccsd -4
13129 ? SLl 0:00 aisexec
13154 ? Ss 0:00 /sbin/groupd
13157 ? SLs 0:00 /sbin/qdiskd -Q
13162 ? Ss 0:00 /sbin/fenced
13167 ? Ss 0:00 /sbin/dlm_controld
Any ideas how to fix that?
Thanks, Stepan.
--
Eurosoftware s.r.o.
skadlec@xxxxxxxxxxxxxxx
+420 379 307 379
+420 724 554 104