The logs from the recovering node are attached. If you need the same from the other node I will get them tonight.

On Sep 2, 2014, at 12:42 PM, David Teigland <teigland@xxxxxxxxxx> wrote:

> We need to sort out which nodes are sending/receiving plock data to/from
> each other. The way it's supposed to work, is an existing node is
> supposed to write its plock data into a checkpoint, then do
> send_plocks_stored() to notify the new node that the data is ready. The
> new node is then supposed to receive_plocks_stored(), and read the plock
> data from the checkpoint.
>
> I could get a better picture if you save and send the output of
> dlm_tool dump > dlm_dump.txt
> dlm_tool log_plock > dlm_plock.txt
>
> after the problem occurs.

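For anyone following along, here is a rough sketch of how the handshake described in the quoted message could be traced in a saved dump. It only filters entries whose text mentions one of the plock checkpoint steps. The marker strings are taken from the function names in the quote and from the log lines in this message, not from the dlm_controld source, so the list may be incomplete. The function name `plock_handshake_events` is made up for illustration, and the code assumes one entry per line, each starting with a 10-digit timestamp.

```python
import re

# Substrings that mark a step of the plock checkpoint handshake.
# Assumption: taken from the log lines in this thread, not exhaustive.
HANDSHAKE_MARKERS = (
    "set_plock_ckpt_node",
    "send_plocks_stored",
    "receive_plocks_stored",
    "retrieve_plocks",
    "plock disabled",
)

# Assumed entry layout: "<10-digit unix time> <free text>".
ENTRY_RE = re.compile(r"^(\d{10})\s+(.*)$")


def plock_handshake_events(lines):
    """Return (timestamp, text) for every entry that mentions a handshake step."""
    events = []
    for line in lines:
        m = ENTRY_RE.match(line.strip())
        if not m:
            continue  # skip blank or unparseable lines
        ts, text = int(m.group(1)), m.group(2)
        if any(marker in text for marker in HANDSHAKE_MARKERS):
            events.append((ts, text))
    return events


if __name__ == "__main__":
    # File name as suggested in the quoted message.
    with open("dlm_dump.txt") as fh:
        for ts, text in plock_handshake_events(fh):
            print(ts, text)
```

Running the same filter over the dump from both nodes and lining the results up by timestamp should show which side wrote the checkpoint and which side tried to read it.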
1409631949 clvmd prepare_plocks
1409631949 clvmd receive_plocks_stored 2:5 flags a sig 0 need_plocks 1
1409631951 lvclusdidiz0360 prepare_plocks
1409631951 lvclusdidiz0360 receive_plocks_stored 2:6 flags a sig 2f6b need_plocks 1
1409634840 lvclusdidiz0360 prepare_plocks
1409634840 lvclusdidiz0360 receive_plocks_stored 2:8 flags a sig 2f6b need_plocks 1
1409634840 lvclusdidiz0360 retrieve_plocks first 0 last 0 r_count 0 p_count 0 sig 0

1409631946 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/dlm_controld.log
1409631946 dlm_controld 3.0.12.1 started
1409631946 logging mode 3 syslog f 160 p 4 logfile p 4 /var/log/cluster/dlm_controld.log
1409631946 found /dev/misc/dlm-control minor 58
1409631946 found /dev/misc/dlm-monitor minor 57
1409631946 found /dev/misc/dlm_plock minor 56
1409631946 /dev/misc/dlm-monitor fd 12
1409631946 /sys/kernel/config/dlm/cluster/comms: opendir failed: 2
1409631946 /sys/kernel/config/dlm/cluster/spaces: opendir failed: 2
1409631946 cluster node 1 added seq 2244
1409631946 set_configfs_node 1 192.168.7.53 local 1
1409631946 cluster node 2 added seq 2244
1409631946 set_configfs_node 2 192.168.7.54 local 0
1409631946 totem/rrp_mode = 'none'
1409631946 set protocol 0
1409631946 group_mode 3 compat 0
1409631946 setup_cpg_daemon 15
1409631946 dlm:controld conf 2 1 0 memb 1 2 join 1 left
1409631946 run protocol from nodeid 2
1409631946 daemon run 1.1.1 max 1.1.1 kernel run 1.1.1 max 1.1.1
1409631946 plocks 18
1409631946 plock cpg message size: 104 bytes
1409631947 client connection 5 fd 19
1409631949 uevent: add@/kernel/dlm/clvmd
1409631949 kernel: add@ clvmd
1409631949 uevent: online@/kernel/dlm/clvmd
1409631949 kernel: online@ clvmd
1409631949 dlm:ls:clvmd conf 2 1 0 memb 1 2 join 1 left
1409631949 clvmd add_change cg 1 joined nodeid 1
1409631949 clvmd add_change cg 1 we joined
1409631949 clvmd add_change cg 1 counts member 2 joined 1 remove 0 failed 0
1409631949 clvmd check_fencing done
1409631949 clvmd check_quorum disabled
1409631949 clvmd check_fs none registered
1409631949 clvmd send_start cg 1 flags 1 data2 0 counts 0 2 1 0 0
1409631949 clvmd receive_start 1:1 len 80
1409631949 clvmd match_change 1:1 matches cg 1
1409631949 clvmd wait_messages cg 1 need 1 of 2
1409631949 clvmd receive_start 2:5 len 80
1409631949 clvmd match_change 2:5 matches cg 1
1409631949 clvmd wait_messages cg 1 got all 2
1409631949 clvmd start_kernel cg 1 member_count 2
1409631949 write "1090842362" to "/sys/kernel/dlm/clvmd/id"
1409631949 set_members mkdir "/sys/kernel/config/dlm/cluster/spaces/clvmd/nodes/1"
1409631949 set_members mkdir "/sys/kernel/config/dlm/cluster/spaces/clvmd/nodes/2"
1409631949 write "1" to "/sys/kernel/dlm/clvmd/control"
1409631949 write "0" to "/sys/kernel/dlm/clvmd/event_done"
1409631949 clvmd set_plock_ckpt_node from 0 to 2
1409631949 clvmd receive_plocks_stored 2:5 flags a sig 0 need_plocks 1
1409631949 clvmd match_change 2:5 matches cg 1
1409631949 clvmd retrieve_plocks
1409631949 retrieve_plocks ckpt open error 12 clvmd
1409631949 lockspace clvmd plock disabled our sig 816fb301 nodeid 2 sig 0
1409631949 uevent: add@/devices/virtual/misc/dlm_clvmd
1409631951 uevent: add@/kernel/dlm/lvclusdidiz0360
1409631951 kernel: add@ lvclusdidiz0360
1409631951 uevent: online@/kernel/dlm/lvclusdidiz0360
1409631951 kernel: online@ lvclusdidiz0360
1409631951 dlm:ls:lvclusdidiz0360 conf 2 1 0 memb 1 2 join 1 left
1409631951 lvclusdidiz0360 add_change cg 1 joined nodeid 1
1409631951 lvclusdidiz0360 add_change cg 1 we joined
1409631951 lvclusdidiz0360 add_change cg 1 counts member 2 joined 1 remove 0 failed 0
1409631951 lvclusdidiz0360 check_fencing done
1409631951 lvclusdidiz0360 check_quorum disabled
1409631951 lvclusdidiz0360 check_fs done
1409631951 lvclusdidiz0360 send_start cg 1 flags 1 data2 0 counts 0 2 1 0 0
1409631951 lvclusdidiz0360 receive_start 1:1 len 80
1409631951 lvclusdidiz0360 match_change 1:1 matches cg 1
1409631951 lvclusdidiz0360 wait_messages cg 1 need 1 of 2
1409631951 lvclusdidiz0360 receive_start 2:6 len 80
1409631951 lvclusdidiz0360 match_change 2:6 matches cg 1
1409631951 lvclusdidiz0360 wait_messages cg 1 got all 2
1409631951 lvclusdidiz0360 start_kernel cg 1 member_count 2
1409631951 write "1723768787" to "/sys/kernel/dlm/lvclusdidiz0360/id"
1409631951 set_members mkdir "/sys/kernel/config/dlm/cluster/spaces/lvclusdidiz0360/nodes/1"
1409631951 set_members mkdir "/sys/kernel/config/dlm/cluster/spaces/lvclusdidiz0360/nodes/2"
1409631951 write "1" to "/sys/kernel/dlm/lvclusdidiz0360/control"
1409631951 write "0" to "/sys/kernel/dlm/lvclusdidiz0360/event_done"
1409631951 lvclusdidiz0360 set_plock_ckpt_node from 0 to 2
1409631951 lvclusdidiz0360 receive_plocks_stored 2:6 flags a sig 2f6b need_plocks 1
1409631951 lvclusdidiz0360 match_change 2:6 matches cg 1
1409631951 lvclusdidiz0360 retrieve_plocks
1409631951 retrieve_plocks ckpt open error 12 lvclusdidiz0360
1409631951 lockspace lvclusdidiz0360 plock disabled our sig 816fba01 nodeid 2 sig 2f6b
1409634420 uevent: offline@/kernel/dlm/lvclusdidiz0360
1409634420 kernel: offline@ lvclusdidiz0360
1409634420 dlm:ls:lvclusdidiz0360 conf 1 0 1 memb 2 join left 1
1409634420 lvclusdidiz0360 confchg for our leave
1409634420 lvclusdidiz0360 stop_kernel cg 0
1409634420 write "0" to "/sys/kernel/dlm/lvclusdidiz0360/control"
1409634420 dir_member 2
1409634420 dir_member 1
1409634420 set_members rmdir "/sys/kernel/config/dlm/cluster/spaces/lvclusdidiz0360/nodes/2"
1409634420 set_members rmdir "/sys/kernel/config/dlm/cluster/spaces/lvclusdidiz0360/nodes/1"
1409634420 set_members lockspace rmdir "/sys/kernel/config/dlm/cluster/spaces/lvclusdidiz0360"
1409634420 write "0" to "/sys/kernel/dlm/lvclusdidiz0360/event_done"
1409634420 uevent: remove@/kernel/dlm/lvclusdidiz0360
1409634420 kernel: remove@ lvclusdidiz0360
1409634840 uevent: add@/kernel/dlm/lvclusdidiz0360
1409634840 kernel: add@ lvclusdidiz0360
1409634840 uevent: online@/kernel/dlm/lvclusdidiz0360
1409634840 kernel: online@ lvclusdidiz0360
1409634840 dlm:ls:lvclusdidiz0360 conf 2 1 0 memb 1 2 join 1 left
1409634840 lvclusdidiz0360 add_change cg 1 joined nodeid 1
1409634840 lvclusdidiz0360 add_change cg 1 we joined
1409634840 lvclusdidiz0360 add_change cg 1 counts member 2 joined 1 remove 0 failed 0
1409634840 lvclusdidiz0360 check_fencing done
1409634840 lvclusdidiz0360 check_quorum disabled
1409634840 lvclusdidiz0360 check_fs done
1409634840 lvclusdidiz0360 send_start cg 1 flags 1 data2 0 counts 0 2 1 0 0
1409634840 lvclusdidiz0360 receive_start 1:1 len 80
1409634840 lvclusdidiz0360 match_change 1:1 matches cg 1
1409634840 lvclusdidiz0360 wait_messages cg 1 need 1 of 2
1409634840 lvclusdidiz0360 receive_start 2:8 len 80
1409634840 lvclusdidiz0360 match_change 2:8 matches cg 1
1409634840 lvclusdidiz0360 wait_messages cg 1 got all 2
1409634840 lvclusdidiz0360 start_kernel cg 1 member_count 2
1409634840 write "1723768787" to "/sys/kernel/dlm/lvclusdidiz0360/id"
1409634840 set_members mkdir "/sys/kernel/config/dlm/cluster/spaces/lvclusdidiz0360/nodes/1"
1409634840 set_members mkdir "/sys/kernel/config/dlm/cluster/spaces/lvclusdidiz0360/nodes/2"
1409634840 write "1" to "/sys/kernel/dlm/lvclusdidiz0360/control"
1409634840 write "0" to "/sys/kernel/dlm/lvclusdidiz0360/event_done"
1409634840 lvclusdidiz0360 set_plock_ckpt_node from 0 to 2
1409634840 lvclusdidiz0360 receive_plocks_stored 2:8 flags a sig 2f6b need_plocks 1
1409634840 lvclusdidiz0360 match_change 2:8 matches cg 1
1409634840 lvclusdidiz0360 retrieve_plocks
1409634840 lvclusdidiz0360 retrieve_plocks first 0 last 0 r_count 0 p_count 0 sig 0
1409634840 lockspace lvclusdidiz0360 plock disabled our sig 0 nodeid 2 sig 2f6b

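Each join attempt in the dump above ends with a "plock disabled" line that carries two signature values, one labelled "our sig" and one reported for the other node. A small sketch for pulling those lines out so the two values can be compared side by side. The field layout in the regular expression is inferred only from the three such lines in this dump, the signatures are assumed to be hexadecimal, and the names `plock_disabled_records`, `our_sig` and `their_sig` are invented for the example.

```python
import re

# Layout inferred from the "plock disabled" lines in this dump (an assumption):
#   <time> lockspace <name> plock disabled our sig <hex> nodeid <n> sig <hex>
PLOCK_DISABLED_RE = re.compile(
    r"^(\d+) lockspace (\S+) plock disabled "
    r"our sig ([0-9a-f]+) nodeid (\d+) sig ([0-9a-f]+)$"
)


def plock_disabled_records(lines):
    """Collect one dict per 'plock disabled' entry, with signatures as ints."""
    records = []
    for line in lines:
        m = PLOCK_DISABLED_RE.match(line.strip())
        if m:
            records.append({
                "time": int(m.group(1)),
                "lockspace": m.group(2),
                "our_sig": int(m.group(3), 16),    # assumed to be hex
                "nodeid": int(m.group(4)),
                "their_sig": int(m.group(5), 16),  # assumed to be hex
            })
    return records


if __name__ == "__main__":
    with open("dlm_dump.txt") as fh:
        for rec in plock_disabled_records(fh):
            print("%d %s our=%x node%d=%x" % (
                rec["time"], rec["lockspace"], rec["our_sig"],
                rec["nodeid"], rec["their_sig"]))
```

On the three lines in this dump the two values differ every time, including the last join, where the local value is 0 after retrieve_plocks reported r_count 0 and p_count 0.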
--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster