Re: cLVM: LVM commands take several minutes to complete

11.09.2015 17:02, Daniel Dehennin wrote:
Hello,

On a two node cluster Ubuntu Trusty:

- Linux nebula3 3.13.0-63-generic #103-Ubuntu SMP Fri Aug 14 21:42:59
   UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

- corosync 2.3.3-1ubuntu1

- pacemaker 1.1.10+git20130802-1ubuntu2.3

- dlm 4.0.1-0ubuntu1

- clvm 2.02.98-6ubuntu2

You need a newer version of this ^

2.02.102 is known to include commit 431eda6, without which the cluster is unusable in a degraded state (and even when one node is merely put into standby).

You see timeouts with both nodes online, so that is a different issue, but the upgrade above will not hurt.
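To check what is actually installed before and after upgrading, the usual package and tool queries are enough (nothing cluster-specific, just a quick sketch):

     # installed package versions of clvm and lvm2
     dpkg -l clvm lvm2 | grep '^ii'
     # ask the daemon and the tools directly
     clvmd -V
     lvm version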


- gfs2-utils 3.1.6-0ubuntu1


The LVM commands take minutes to complete:

     root@nebula3:~# time vgs
       Error locking on node 40a8e784: Command timed out
       Error locking on node 40a8e784: Command timed out
       Error locking on node 40a8e784: Command timed out
       VG             #PV #LV #SN Attr   VSize    VFree
       nebula3-vg       1   4   0 wz--n-  133,52g       0
       one-fs           1   1   0 wz--nc    2,00t       0
       one-production   1   0   0 wz--nc 1023,50g 1023,50g

     real    5m40.233s
     user    0m0.005s
     sys     0m0.018s

Do you know where I can look to find what's going on?

Here is some information:

     root@nebula3:~# corosync-quorumtool
     Quorum information
     ------------------
     Date:             Fri Sep 11 15:57:17 2015
     Quorum provider:  corosync_votequorum
     Nodes:            2
     Node ID:          1084811139
     Ring ID:          1460
     Quorate:          Yes

     Votequorum information
     ----------------------
     Expected votes:   2
     Highest expected: 2
     Total votes:      2
     Quorum:           1
     Flags:            2Node Quorate WaitForAll LastManStanding

Better to use two_node: 1 in the votequorum section.
That implies wait_for_all and supersedes last_man_standing on two-node clusters.
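If it helps, a minimal quorum stanza with that setting would look roughly like this (a sketch only, merge it into your existing corosync.conf):

     quorum {
         provider: corosync_votequorum
         two_node: 1
     }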

I'd also recommend setting clear_node_high_bit in the totem section; do you use it?
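Something along these lines in the totem section; only clear_node_high_bit is the point here, the rest stands in for whatever you already have:

     totem {
         version: 2
         clear_node_high_bit: yes
         # keep your existing interface/transport settings here
     }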

But even better is to add a nodelist section to corosync.conf with manually specified nodeids.
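A sketch of such a nodelist, using the two ring addresses from your membership output; the nodeid values 1 and 2 are just examples, pick whatever small ids you prefer:

     nodelist {
         node {
             ring0_addr: 192.168.231.131
             nodeid: 1
         }
         node {
             ring0_addr: 192.168.231.132
             nodeid: 2
         }
     }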

Everything else looks fine...


     Membership information
     ----------------------
         Nodeid      Votes Name
     1084811139          1 192.168.231.131 (local)
     1084811140          1 192.168.231.132


     root@nebula3:~# dlm_tool ls
     dlm lockspaces
     name          datastores
     id            0x1b61ba6a
     flags         0x00000000
     change        member 2 joined 1 remove 0 failed 0 seq 1,1
     members       1084811139 1084811140

     name          clvmd
     id            0x4104eefa
     flags         0x00000000
     change        member 2 joined 1 remove 0 failed 0 seq 1,1
     members       1084811139 1084811140


     root@nebula3:~# dlm_tool status
     cluster nodeid 1084811139 quorate 1 ring seq 1460 1460
     daemon now 11026 fence_pid 0
     node 1084811139 M add 455 rem 0 fail 0 fence 0 at 0 0
     node 1084811140 M add 455 rem 0 fail 0 fence 0 at 0 0


Regards.




--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster


