Re: new cluster acting odd

> On Dec 1, 2014, at 13:24, Digimer <lists@xxxxxxxxxx> wrote:
> GFS2, like any clustered filesystem, requires cluster locking. This locking comes with a non-trivial overhead. Exporting NFS allows you to avoid this bottle-neck and with a simple 2-node cluster behind the scenes, you maintain full HA.

We have a few small GFS2 file systems and one largish (2TB) one.  The small ones are fine, the large one is a pain.  We're in the process of converting the large one to XFS with NFS (the backend for this is all iSCSI devices).  For our application, NFSv4 makes this possible, as it provides much better consistency properties than the previous versions.

>>> On 01/12/14 11:56 AM, Megan . wrote:
>>>> fence_tool dump worked on one of my nodes, but it is just hanging on the
>>>> rest.

Well, that's not good.  I'd have to look at the fence_tool source to even figure out how it could be blocking.

>>>> 
>>>> [root@map1-uat ~]# fence_tool dump
>>>> 1417448610 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/fenced.log
>>>> 1417448610 fenced 3.0.12.1 started
>>>> 1417448610 connected to dbus :1.12
>>>> 1417448610 cluster node 1 added seq 89048
>>>> 1417448610 cluster node 2 added seq 89048
>>>> 1417448610 cluster node 3 added seq 89048
>>>> 1417448610 cluster node 4 added seq 89048
>>>> 1417448610 cluster node 5 added seq 89048
>>>> 1417448610 cluster node 6 added seq 89048
>>>> 1417448610 cluster node 8 added seq 89048
>>>> 1417448610 our_nodeid 4 our_name map1-uat.project.domain.com
>>>> 1417448611 logging mode 3 syslog f 160 p 6 logfile p 6 /var/log/cluster/fenced.log
>>>> 1417448611 logfile cur mode 100644
>>>> 1417448611 cpg_join fenced:daemon ...
>>>> 1417448621 daemon cpg_join error retrying
>>>> 1417448631 daemon cpg_join error retrying
>>>> 1417448641 daemon cpg_join error retrying
>>>> 1417448651 daemon cpg_join error retrying
>>>> 1417448661 daemon cpg_join error retrying
>>>> 1417448671 daemon cpg_join error retrying
>>>> 1417448681 daemon cpg_join error retrying
>>>> 1417448691 daemon cpg_join error retrying

And that looks like the fence group is failing the membership transition.  Nothing else will work properly if the fence group is busted.
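
If it helps to compare nodes, here is a rough sketch of a script that summarizes the cpg_join retries in a saved dump like the one quoted above.  It is my own throwaway helper, not part of the cluster tools: the function name summarize_fenced_dump and the returned dictionary keys are made up for illustration, and it assumes every log line starts with a 10-digit epoch timestamp, as the lines in the dump do.

```python
import re
from datetime import datetime, timezone

# Assumption: each fenced log line is "<10-digit epoch seconds> <message>".
LINE_RE = re.compile(r"^(\d{10})\s+(.*)$")


def summarize_fenced_dump(text):
    """Summarize cpg_join retries in saved `fence_tool dump` output.

    Hypothetical helper for illustration only.
    """
    join_started = None
    retries = []
    for raw in text.splitlines():
        # Drop mail quoting (">>>> ") if the dump was pasted from a reply.
        line = raw.lstrip("> ").strip()
        match = LINE_RE.match(line)
        if not match:
            continue  # e.g. a wrapped continuation line holding only a path
        ts, msg = int(match.group(1)), match.group(2)
        if msg.startswith("cpg_join fenced:daemon"):
            join_started = ts
        elif "cpg_join error retrying" in msg:
            retries.append(ts)

    # Distinct gaps between consecutive retries, in seconds.
    gaps = sorted({b - a for a, b in zip(retries, retries[1:])})
    stuck_for = None
    started_utc = None
    if join_started is not None:
        started_utc = datetime.fromtimestamp(
            join_started, tz=timezone.utc
        ).isoformat()
        if retries:
            stuck_for = retries[-1] - join_started
    return {
        "join_started_utc": started_utc,
        "retries": len(retries),
        "retry_gaps_s": gaps,
        "stuck_for_s": stuck_for,
    }


# The tail of the dump quoted in this thread.
SAMPLE = """\
>>>> 1417448611 logfile cur mode 100644
>>>> 1417448611 cpg_join fenced:daemon ...
>>>> 1417448621 daemon cpg_join error retrying
>>>> 1417448631 daemon cpg_join error retrying
>>>> 1417448641 daemon cpg_join error retrying
>>>> 1417448651 daemon cpg_join error retrying
>>>> 1417448661 daemon cpg_join error retrying
>>>> 1417448671 daemon cpg_join error retrying
>>>> 1417448681 daemon cpg_join error retrying
>>>> 1417448691 daemon cpg_join error retrying
"""

summary = summarize_fenced_dump(SAMPLE)
```

A node where the retry count keeps climbing and no later line follows the cpg_join attempt is one where fenced never made it into the daemon group, which is what the dump above shows.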

-dan


-- 
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster



