What does this error mean?

Short answers: yes, all on the same subnet.
Every host can ping the others.
iptables shows empty entries for all filters.

Details are here - http://pastebin.com/eKtRMbGE

I explicitly turned iptables off again, and then checked again:

jc1letgfs5
Firewall is stopped.

jc1letgfs6
Firewall is stopped.

jc1letgfs7
Firewall is stopped.

jc1letgfs8
Firewall is stopped.

Thanks,

James

-----Original Message-----
From: Mohit Anchlia [mailto:mohitanchlia at gmail.com] 
Sent: Monday, March 21, 2011 2:25 PM
To: Burnash, James
Cc: gluster-users at gluster.org
Subject: Re: What does this error mean?

Are they in same subnet? What happens if you ping these hosts
individually? Do they ping?

I looked closely at the error you posted, and "connection to
10.20.72.157:24007 failed (No route to host)" points to either a firewall
issue or a switch issue on the network. A ping test from each host to
every other host would be helpful.

Can you post results of ping and also "service iptables status" from each node?
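
Since ping only exercises ICMP, it may also be worth probing the glusterd management port (24007, from the log line) over TCP from each node. Below is a rough sketch of such a check, not an official Gluster tool. The function names `classify_connect_error` and `probe` are made up for illustration, and the hints it prints are only rules of thumb. On Linux, a firewall REJECT rule that answers with an ICMP "host prohibited" message typically surfaces as the same "No route to host" error as a genuine routing problem, so that result alone does not tell the two apart.

```python
import errno
import socket
import sys

# Management port taken from the log line quoted in this thread.
GLUSTERD_PORT = 24007


def classify_connect_error(err_no):
    """Map a connect() errno to a short, rule-of-thumb hint (hypothetical helper)."""
    if err_no == 0:
        return "open"
    if err_no == errno.EHOSTUNREACH:
        return "no route to host (routing, switch, or a firewall REJECT rule)"
    if err_no == errno.ECONNREFUSED:
        return "refused (host reachable, nothing listening on the port)"
    if err_no == errno.ETIMEDOUT:
        return "timed out (packets possibly dropped silently)"
    return "other: " + errno.errorcode.get(err_no, str(err_no))


def probe(host, port=GLUSTERD_PORT, timeout=3.0):
    """Try one TCP connection to host:port and describe the outcome."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.gaierror:
        return "name resolution failed"
    except socket.timeout:
        return classify_connect_error(errno.ETIMEDOUT)
    except OSError as exc:
        return classify_connect_error(exc.errno)


if __name__ == "__main__":
    # Hosts are passed on the command line; nothing is probed without arguments.
    for target in sys.argv[1:]:
        print(target, "->", probe(target))
```

Saved under any name (say `probe24007.py`, a hypothetical filename), it could be run on each node as `python probe24007.py jc1letgfs5 jc1letgfs6 jc1letgfs7 jc1letgfs8` and the results compared across nodes.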

On Mon, Mar 21, 2011 at 11:16 AM, Burnash, James <jburnash at knight.com> wrote:
> A little more information:
>
> From the original (first peer node):
> root at jc1letgfs5:/etc/glusterd/vols# gluster peer status
> Number of Peers: 3
>
> Hostname: jc1letgfs6
> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
> State: Peer in Cluster (Disconnected)
>
> Hostname: jc1letgfs7
> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
> State: Peer in Cluster (Connected)
>
> Hostname: jc1letgfs8
> Uuid: 13f4ce3f-042e-4144-a76c-d2b1b91676bd
> State: Peer in Cluster (Connected)
>
>
> From the problem node:
> *** NOTE - only one Peer seen
> root at jc1letgfs6:~# gluster peer status
> Number of Peers: 1
>
> Hostname: 10.20.72.156
> Uuid: 95e1d79a-632a-4774-9d7e-a7234cb084ca
> State: Peer in Cluster (Connected)
>
>
> From a different peer node:
> root at jc1letgfs8:~# gluster peer status
> Number of Peers: 3
>
> Hostname: jc1letgfs6
> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
> State: Peer Rejected (Connected)
>
> Hostname: jc1letgfs7
> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
> State: Peer in Cluster (Connected)
>
> Hostname: 10.20.72.156
> Uuid: 95e1d79a-632a-4774-9d7e-a7234cb084ca
> State: Peer in Cluster (Connected)
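
The three views above disagree with each other, so it can help to compare them mechanically rather than by eye. The following is a minimal sketch that parses `gluster peer status` text and lists peers whose state is anything other than "Peer in Cluster (Connected)". It assumes the Hostname/Uuid/State layout shown in this thread; other GlusterFS versions may print different fields. `parse_peer_status` and `unhealthy` are illustrative names, not part of Gluster.

```python
# Output copied from the jc1letgfs8 view quoted in this thread.
SAMPLE = """\
Number of Peers: 3

Hostname: jc1letgfs6
Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
State: Peer Rejected (Connected)

Hostname: jc1letgfs7
Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
State: Peer in Cluster (Connected)

Hostname: 10.20.72.156
Uuid: 95e1d79a-632a-4774-9d7e-a7234cb084ca
State: Peer in Cluster (Connected)
"""

HEALTHY_STATE = "Peer in Cluster (Connected)"


def parse_peer_status(text):
    """Return {hostname: (uuid, state)} parsed from `gluster peer status` text."""
    peers = {}
    host = uuid = None
    for line in text.splitlines():
        key, sep, value = line.strip().partition(":")
        if not sep:
            continue  # blank line or a line without a "key: value" shape
        value = value.strip()
        if key == "Hostname":
            host, uuid = value, None
        elif key == "Uuid":
            uuid = value
        elif key == "State" and host is not None:
            peers[host] = (uuid, value)
            host = None
    return peers


def unhealthy(peers):
    """Return {hostname: state} for every peer not in the healthy state."""
    return {h: state for h, (_, state) in peers.items() if state != HEALTHY_STATE}


if __name__ == "__main__":
    for name, state in unhealthy(parse_peer_status(SAMPLE)).items():
        print(name, "->", state)
```

Feeding it the output captured on each node in turn makes the mismatch explicit: the problem node lists only one peer, while the other nodes list three and report jc1letgfs6 as either Disconnected or Rejected.
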
>
> -----Original Message-----
> From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Burnash, James
> Sent: Monday, March 21, 2011 2:05 PM
> To: Mohit Anchlia
> Cc: gluster-users at gluster.org
> Subject: Re: What does this error mean?
>
> I did do this, and nothing in particular stands out.
>
> I'll exercise it some more, and see if we can get something that will at least point in the proper direction.
>
> I suspect that another reboot of the affected machine will fix this condition - but it won't help me understand the root problem the next time this happens.
>
> Thanks,
>
> James
>
> -----Original Message-----
> From: Mohit Anchlia [mailto:mohitanchlia at gmail.com]
> Sent: Monday, March 21, 2011 12:40 PM
> To: Burnash, James
> Cc: gluster-users at gluster.org
> Subject: Re: What does this error mean?
>
> Can you turn on DEBUG and see if there is something that stands out?
>
> On Mon, Mar 21, 2011 at 9:34 AM, Burnash, James <jburnash at knight.com> wrote:
>> Does anybody have any clue as to why this is happening? The problem has persisted for several days now, but I can't find anything at all in the logs to possibly explain why this is so.
>>
>> -----Original Message-----
>> From: gluster-users-bounces at gluster.org [mailto:gluster-users-bounces at gluster.org] On Behalf Of Burnash, James
>> Sent: Wednesday, March 16, 2011 9:10 AM
>> To: gluster-users at gluster.org
>> Subject: [SPAM?] What does this error mean?
>> Importance: Low
>>
>> Hello.
>>
>> After purposely crashing node jc1letgfs6 (via 'echo b > /proc/sysrq-trigger') to test mirroring, I still see the state "Disconnected" for that node when I run the following command on the first storage node, even though the node has rebooted and is back online:
>>
>> root at jc1letgfs5:/etc/glusterd/vols# gluster peer status
>> Number of Peers: 3
>>
>> Hostname: jc1letgfs6
>> Uuid: cd590fad-022c-4b9a-97f5-3262080d772d
>> State: Peer in Cluster (Disconnected)
>>
>> Hostname: jc1letgfs7
>> Uuid: c5f40de4-9bb1-47ad-93b6-d52c6689ee29
>> State: Peer in Cluster (Disconnected)
>>
>> Hostname: jc1letgfs8
>> Uuid: 13f4ce3f-042e-4144-a76c-d2b1b91676bd
>> State: Peer in Cluster (Connected)
>>
>> This is running on 4 servers with CentOS 5.5 (x86_64), GlusterFS 3.1.1
>>
>> Here is the volume info:
>>
>> # gluster volume info
>>
>> Volume Name: test-pfs-ro1
>> Type: Distributed-Replicate
>> Status: Started
>> Number of Bricks: 4 x 2 = 8
>> Transport-type: tcp
>> Bricks:
>> Brick1: jc1letgfs5:/export/read-only/g01
>> Brick2: jc1letgfs6:/export/read-only/g01
>> Brick3: jc1letgfs5:/export/read-only/g02
>> Brick4: jc1letgfs6:/export/read-only/g02
>> Brick5: jc1letgfs7:/export/read-only/g01
>> Brick6: jc1letgfs8:/export/read-only/g01
>> Brick7: jc1letgfs7:/export/read-only/g02
>> Brick8: jc1letgfs8:/export/read-only/g02
>> Options Reconfigured:
>> performance.stat-prefetch: on
>> performance.cache-size: 2GB
>> network.ping-timeout: 10
>>
>> Even with this error, mirroring functions as expected, and the node is recognized and utilized, as can be seen in this log fragment from jc1letgfs5: /var/log/glusterfs/etc-glusterfs-glusterd.vol.log
>>
>> [2011-03-13 23:51:31.458329] E [socket.c:1656:socket_connect_finish] management: connection to 10.20.72.157:24007 failed (No route to host)
>> [2011-03-13 23:53:49.42170] I [glusterd3_1-mops.c:172:glusterd3_1_friend_add_cbk] glusterd: Received ACC from uuid: cd590fad-022c-4b9a-97f5-3262080d772d, host: jc1letgfs6, port: 0
>> [2011-03-13 23:53:49.42204] I [glusterd-utils.c:2062:glusterd_friend_find_by_uuid] glusterd: Friend found.. state: Peer in Cluster
>> [2011-03-13 23:53:49.42320] I [glusterd-utils.c:2062:glusterd_friend_find_by_uuid] glusterd: Friend found.. state: Peer in Cluster
>> [2011-03-13 23:53:49.42336] I [glusterd-handler.c:2267:glusterd_handle_friend_update] glusterd: Received friend update from uuid: cd590fad-022c-4b9a-97f5-3262080d772d
>> [2011-03-13 23:53:49.42359] I [glusterd-handler.c:2312:glusterd_handle_friend_update] : Received uuid: 95e1d79a-632a-4774-9d7e-a7234cb084ca, hostname:10.20.72.156
>> [2011-03-13 23:53:49.42412] I [glusterd-handler.c:2315:glusterd_handle_friend_update] : Received my uuid as Friend
>>
>>
>> Any pointers or help would be appreciated.
>>
>> James Burnash, Unix Engineering
>>
>>
>> DISCLAIMER:
>> This e-mail, and any attachments thereto, is intended only for use by the addressee(s) named herein and may contain legally privileged and/or confidential information. If you are not the intended recipient of this e-mail, you are hereby notified that any dissemination, distribution or copying of this e-mail, and any attachments thereto, is strictly prohibited. If you have received this in error, please immediately notify me and permanently delete the original and any copy of any e-mail and any printout thereof. E-mail transmission cannot be guaranteed to be secure or error-free. The sender therefore does not accept liability for any errors or omissions in the contents of this message which arise as a result of e-mail transmission.
>> NOTICE REGARDING PRIVACY AND CONFIDENTIALITY Knight Capital Group may, at its discretion, monitor and review the content of all e-mail communications. http://www.knight.com
>> _______________________________________________
>> Gluster-users mailing list
>> Gluster-users at gluster.org
>> http://gluster.org/cgi-bin/mailman/listinfo/gluster-users
>>
>

