Gluster Failing after a period of time

Hi everyone,

New to the list, and hope I can get some help.

I have configured Gluster in an AWS environment using 2 nodes (servers/instances).
I have a website that runs on 20-plus Apache instances. They all connected to an NFS server for their code base.

This past weekend we migrated from NFS to Gluster. We are using the Gluster client on each Apache instance to connect to the cluster.

Everything worked fine, and on each Apache server I could browse the Gluster mount with no problems and at good speed. It ran fine for about 7 hours, then Apache started to fail.

After logging into the Apache servers, I found the mounts to the cluster were still working, but they were a bit slow when running "ls" on the directory.

I then shut down Apache, unmounted Gluster, remounted it, and started Apache again. It ran fine for another 10 to 20 minutes, then Apache started to fail again.

I assume the failure could be caused by load, but neither Gluster nor Apache was taxed at all. My gut is telling me it is the network, and I am also seeing this in the logs at the time of the issue:

[socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (76.226.144.165:1023)
[2013-08-18 15:43:07.479124] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (109.22.29.128:1023)
[2013-08-18 15:43:07.506516] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (168.129.133.122:1023)
[2013-08-18 15:43:07.531118] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (76.16.48.127:1023)
[2013-08-18 15:43:07.564645] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (67.226.67.69:1023)
[2013-08-18 15:43:07.569733] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (175.129.46.148:1023)
[2013-08-18 15:43:07.586239] W [socket.c:1494:__socket_proto_state_machine] 0-socket.management: reading from socket failed. Error (Transport endpoint is not connected), peer (53.235.83.220:1023)

(These entries are from /var/log/glusterfs/etc-glusterfs-glusterd.vol.log)
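
To see whether these warnings come from a handful of clients or from all of them, the entries could be tallied per peer address. Below is a minimal Python sketch, assuming the log lines follow the same format as the excerpt above. The function name count_failures_by_peer is hypothetical and not part of Gluster, and lines that do not match the pattern are simply skipped.

```python
import re
from collections import Counter

# Pattern based only on the warning lines quoted in this post. It does not
# depend on the timestamp, because the first quoted line was pasted without one.
FAILURE_RE = re.compile(
    r"reading from socket failed\. Error \((?P<error>[^)]*)\), "
    r"peer \((?P<ip>\d{1,3}(?:\.\d{1,3}){3}):(?P<port>\d+)\)"
)


def count_failures_by_peer(lines):
    """Return a Counter mapping peer IP -> number of socket read failures."""
    counts = Counter()
    for line in lines:
        match = FAILURE_RE.search(line)
        if match:
            counts[match.group("ip")] += 1
    return counts


def print_summary(path):
    """Print peers sorted by failure count, highest first."""
    # errors="replace" keeps the scan going if the log has odd bytes in it.
    with open(path, errors="replace") as handle:
        counts = count_failures_by_peer(handle)
    for ip, total in counts.most_common():
        print(f"{total:6d}  {ip}")
```

Calling print_summary("/var/log/glusterfs/etc-glusterfs-glusterd.vol.log") on each Gluster node would list the peers with the most failures first, which might help tell a network-wide problem from one limited to a few clients.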

Thanks all in advance!

Dan


--
Dan Belkie
Shelter Six Technologies Inc.
403.397.4491
http://www.sheltersix.com
dbelkie at sheltersix.com
Skype: dbelkie
