Re: Linux-cluster Digest, Vol 117, Issue 6

Hi,
According to the clustat output it looks like there is some issue with multicasting. Please check multicast connectivity between the nodes using omping.
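
For example, a rough sketch of an omping check; node1-hb and node2-hb are
placeholders for the heartbeat hostnames, and the same command has to be
run on both nodes at the same time:

    # node1-hb / node2-hb are placeholders for the heartbeat hostnames
    omping node1-hb node2-hb
    # optionally test the cluster's own multicast address (as shown by
    # "cman_tool status"): omping -m <mcast-addr> node1-hb node2-hb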

On Jan 17, 2014 8:04 PM, <linux-cluster-request@xxxxxxxxxx> wrote:
Send Linux-cluster mailing list submissions to
        linux-cluster@xxxxxxxxxx

To subscribe or unsubscribe via the World Wide Web, visit
        https://www.redhat.com/mailman/listinfo/linux-cluster
or, via email, send a message with subject or body 'help' to
        linux-cluster-request@xxxxxxxxxx

You can reach the person managing the list at
        linux-cluster-owner@xxxxxxxxxx

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Linux-cluster digest..."


Today's Topics:

   1. 2 node cluster questions (Benjamin Budts)
   2. Re: 2 node cluster questions (emmanuel segura)
   3. Is gfs2 able to perform on a 24TB partition?
      (Juan Pablo Lorier)
   4. Re: Is gfs2 able to perform on a 24TB partition?
      (Steven Whitehouse)
   5. Re: 2 node cluster questions (Benjamin Budts)
   6. Re: 2 node cluster questions (emmanuel segura)


----------------------------------------------------------------------

Message: 1
Date: Thu, 16 Jan 2014 18:04:18 +0100
From: "Benjamin Budts" <ben@xxxxxxxxxx>
To: <linux-cluster@xxxxxxxxxx>
Subject: 2 node cluster questions
Message-ID: <002801cf12dc$fe8d1e40$fba75ac0$@zentrix.be>
Content-Type: text/plain; charset="us-ascii"



Hey All,

About my setup:

I created a cluster in Luci with shared storage, added 2 nodes (reachable)
and added 2 fence devices (iDRAC).

Now I can't seem to add my fence devices to my nodes. If I click on a node,
it times out in the GUI.

So I checked my node logs and found in /var/log/messages that ricci hangs on
/etc/init.d/clvmd status.

When I run the command manually, it prints the PID and then hangs.

I also seem to have a split-brain configuration:

# clustat on node 1 shows: node 1 Online, Local
                           node 2 Offline

# clustat on node 2 shows: node 1 Offline
                           node 2 Online, Local

My next step was testing whether my multicast is working correctly (I
suspect it isn't). Would you have any recommendations besides the following
Red Hat multicast test article?

https://access.redhat.com/site/articles/22304

Some info about my systems:
----------------------------------

- Red Hat 6.5 on 2 nodes, with Resilient Storage & High Availability
  add-on licenses
- Mgmt. station: Red Hat 6.5 with Luci

Thx a lot


------------------------------

Message: 2
Date: Thu, 16 Jan 2014 18:17:11 +0100
From: emmanuel segura <emi2fast@xxxxxxxxx>
To: linux clustering <linux-cluster@xxxxxxxxxx>
Subject: Re: 2 node cluster questions
Message-ID:
        <CAE7pJ3DGsXJx602Tr73UjFFrrYpFi686TJeKQFqybxMv+5D00A@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset="windows-1252"

If you think your problem is multicast and you are using Red Hat 6.x, try
using unicast (UDPU) instead:
https://access.redhat.com/site/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Cluster_Administration/s1-unicast-traffic-CA.html
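
As a rough sketch (not a full procedure), the switch is a one-attribute
change to the <cman/> element in /etc/cluster/cluster.conf:

    <cman transport="udpu"/>

followed by validating the config and restarting the cluster stack on each
node, e.g.:

    ccs_config_validate
    service cman restart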


2014/1/16 Benjamin Budts <ben@xxxxxxxxxx>

> [...]



--
this is my life and I live it as long as God wills

------------------------------

Message: 3
Date: Fri, 17 Jan 2014 09:10:42 -0200
From: Juan Pablo Lorier <jplorier@xxxxxxxxx>
To: linux-cluster@xxxxxxxxxx
Subject: Is gfs2 able to perform on a 24TB partition?
Message-ID: <52D90FB2.4060906@xxxxxxxxx>
Content-Type: text/plain; charset=ISO-8859-1

Hi,

I've been using gfs2 on top of a 24 TB LVM volume used as a file server
for several months now, and I see a lot of I/O related to glock_workqueue
when there are file transfers. The threads even get to be the top ones in
read ops in iotop.
Is that normal? How can I debug this?
Regards,



------------------------------

Message: 4
Date: Fri, 17 Jan 2014 11:16:28 +0000
From: Steven Whitehouse <swhiteho@xxxxxxxxxx>
To: Juan Pablo Lorier <jplorier@xxxxxxxxx>
Cc: linux-cluster@xxxxxxxxxx
Subject: Re: Is gfs2 able to perform on a 24TB partition?
Message-ID: <1389957388.2734.20.camel@menhir>
Content-Type: text/plain; charset="UTF-8"

Hi,

On Fri, 2014-01-17 at 09:10 -0200, Juan Pablo Lorier wrote:
> Hi,
>
> I've been using gfs2 on top of a 24 TB LVM volume used as a file server
> for several months now, and I see a lot of I/O related to glock_workqueue
> when there are file transfers. The threads even get to be the top ones in
> read ops in iotop.
> Is that normal? How can I debug this?
> Regards,
>

Well, the glock_workqueue threads are running the internal state machine
that controls the caching of data. So depending on the workload in
question, yes, it is expected that you'll see a fair amount of activity
there.

If the issue is that you are seeing the cache being used inefficiently,
then it may be possible to improve that by looking at the I/O pattern
generated by the application. There are also other things which can make
a difference, such as setting noatime on the mount.
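
For example, a minimal sketch of an fstab entry with noatime for a gfs2
mount (the device path and mount point below are placeholders):

    /dev/clustervg/gfs2lv  /srv/share  gfs2  defaults,noatime,nodiratime  0 0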

You don't mention what version of the kernel you are using, but more
recent kernels have better tools (such as tracepoints) which can assist
in debugging this kind of issue.
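
As a rough sketch, assuming debugfs and a kernel that carries the gfs2
tracepoints, they can be enabled and read like this:

    mount -t debugfs none /sys/kernel/debug   # only if not already mounted
    echo 1 > /sys/kernel/debug/tracing/events/gfs2/enable
    cat /sys/kernel/debug/tracing/trace_pipe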

Steve.




------------------------------

Message: 5
Date: Fri, 17 Jan 2014 15:01:22 +0100
From: "Benjamin Budts" <ben@xxxxxxxxxx>
To: <linux-cluster@xxxxxxxxxx>
Subject: Re: 2 node cluster questions
Message-ID: <005e01cf138c$9a981100$cfc83300$@zentrix.be>
Content-Type: text/plain; charset="us-ascii"



Found my problem.

When running cman_tool status I saw that each node only saw 1 node.
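
For reference, a rough sketch of those checks (run on each node; output
not shown here):

    cman_tool status | grep -i -E 'nodes|expected|quorum'
    cman_tool nodes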

I have 2 pairs of networks on each machine (heartbeat network, 1 Gbit,
multicast enabled / application network, 10 Gbit), each with its own host
mapping in /etc/hosts.
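
For illustration, a sketch of what that /etc/hosts mapping looks like (the
addresses and hostnames here are made up):

    # heartbeat network (used for cluster membership)
    192.168.10.1   node1-hb
    192.168.10.2   node2-hb
    # application network
    10.0.0.1       node1
    10.0.0.2       node2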



I used the non-heartbeat hostnames to add the nodes into the cluster via
Luci, which caused the issues.

Removed Luci & /var/lib/luci on the nodes, as well as running
cat /dev/null > /etc/cluster/cluster.conf to empty the cluster config.

Reinstalled Luci, created the cluster, added the 2 nodes with the correct
hostnames, problem solved.

Could anyone point me to a good guide for using clustered LVM?

Thx







From: Elvir Kuric [mailto:ekuric@xxxxxxxxxx]
Sent: Friday, 17 January 2014 10:42
To: ben@xxxxxxxxxx
Subject: Re: 2 node cluster questions



On 01/16/2014 06:04 PM, Benjamin Budts wrote:



[...]

Hard to say what the issue is based on this without fully checking the logs.
Can you please open a case via the Red Hat customer portal,
https://access.redhat.com (I assume you have a valid subscription), so we
can take a deeper look into this problem?

Thank you in advance,

Kind regards,





--
Elvir Kuric, Sr. TSE / Red Hat / GSS EMEA

------------------------------

Message: 6
Date: Fri, 17 Jan 2014 15:27:42 +0100
From: emmanuel segura <emi2fast@xxxxxxxxx>
To: linux clustering <linux-cluster@xxxxxxxxxx>
Subject: Re: 2 node cluster questions
Message-ID:
        <CAE7pJ3Bi7woB42pD_dCS0xBrAs8uMgGCZCmfc9GyBzk3bwNr-A@xxxxxxxxxxxxxx>
Content-Type: text/plain; charset="windows-1252"

https://access.redhat.com/site/documentation/it-IT/Red_Hat_Enterprise_Linux/6/html/Logical_Volume_Manager_Administration/LVM_Cluster_Overview.html
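
As a rough sketch of the usual RHEL 6 steps for putting a volume group
under clvmd (device and volume names below are placeholders, and this
assumes cman is already running on all nodes):

    lvmconf --enable-cluster         # sets locking_type = 3 in /etc/lvm/lvm.conf
    service clvmd start
    chkconfig clvmd on
    vgcreate -cy clustervg /dev/sdb  # -cy marks the volume group as clustered
    lvcreate -L 100G -n sharedlv clustervg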


2014/1/17 Benjamin Budts <ben@xxxxxxxxxx>

> [...]



--
this is my life and I live it as long as God wills

------------------------------

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster

End of Linux-cluster Digest, Vol 117, Issue 6
*********************************************
