I believe you'll find that this works in the tla repository (2.4; 2.5 is
significantly different code), which has a few patches beyond pre4.
On Thu, 21 Jun 2007, Daniel wrote:
1.30-pre4
afr across 2 servers
servers are io-streams, no write back no read forward
TCP on a Gigabit network
We set up a stress-test script to test the client, using PHP and about 36
instances of the script. Occasionally we get a "transport end point not
connected" error, which kills all of the instances (intentionally: they halt
on error, but it means the mount went stale). Without any intervention,
gluster picks up again and seems to operate fine when we re-run the scripts.
We're pushing roughly 300 writes a second in the test.
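
For anyone wanting to reproduce this kind of load, here is a minimal sketch of a single halt-on-error writer. It is written in Python rather than the PHP used in the test above, and the function name, file naming scheme, and 512-byte payload are all assumptions made for illustration, not details of the original script.

```python
import os


def stress_write(directory, worker_id, n_writes, payload=b"x" * 512):
    """Write n_writes small files into directory, halting at the first error.

    Returns (completed_writes, error); error is None if every write succeeded.
    The payload size is arbitrary and not taken from the original test.
    """
    for i in range(n_writes):
        path = os.path.join(directory, f"w{worker_id}-{i}.dat")
        try:
            with open(path, "wb") as fh:
                fh.write(payload)
                fh.flush()
                # Push the data through to the filesystem so errors surface here.
                os.fsync(fh.fileno())
        except OSError as exc:
            # Halt on the first failure, as the test instances above do.
            return i, exc
    return n_writes, None
```

Running about 36 of these in parallel against a directory on the mount (one process per `worker_id`) would approximate the setup described, though the actual write pattern of the PHP script is not known.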
the only debug info in the log is the following:
[Jun 21 19:33:29] [CRITICAL/client-protocol.c:218/call_bail()]
client/protocol:bailing transport
[Jun 21 19:33:29] [CRITICAL/client-protocol.c:218/call_bail()]
client/protocol:bailing transport
[Jun 21 19:33:29] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0
bytes r/w instead of 113 (errno=104)
[Jun 21 19:33:29] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:mortar1:
connection to server disconnected
[Jun 21 19:33:29] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0
bytes r/w instead of 113 (errno=104)
[Jun 21 19:33:29] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:mortar2:
connection to server disconnected
[Jun 21 19:33:29] [ERROR/client-protocol.c:204/client_protocol_xfer()]
protocol/client:transport_submit failed
[Jun 21 19:33:29] [ERROR/client-protocol.c:204/client_protocol_xfer()]
protocol/client:transport_submit failed
[Jun 21 19:33:29] [CRITICAL/client-protocol.c:218/call_bail()]
client/protocol:bailing transport
[Jun 21 19:33:29] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:mortar2:
connection to server disconnected
[Jun 21 19:33:29] [ERROR/client-protocol.c:204/client_protocol_xfer()]
protocol/client:transport_submit failed
[Jun 21 19:33:29] [CRITICAL/client-protocol.c:218/call_bail()]
client/protocol:bailing transport
[Jun 21 19:33:29] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0
bytes r/w instead of 113 (errno=115)
[Jun 21 19:33:29] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:mortar1:
connection to server disconnected
[Jun 21 19:33:29] [ERROR/client-protocol.c:204/client_protocol_xfer()]
protocol/client:transport_submit failed
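
Since the entries above repeat, it can help to tally them and decode the errno values. The following is a rough sketch, assuming the client runs on Linux, where errno 104 is ECONNRESET and 115 is EINPROGRESS (other platforms number these differently). The regular expressions are inferred from the lines in this excerpt only, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Linux (asm-generic) errno numbers; an assumption about the client platform.
LINUX_ERRNO = {
    104: "ECONNRESET (Connection reset by peer)",
    107: "ENOTCONN (Transport endpoint is not connected)",
    115: "EINPROGRESS (Operation now in progress)",
}

# Matches tags such as [CRITICAL/client-protocol.c:218/call_bail()]
HEADER = re.compile(r"\[([A-Z]+)/([\w.-]+):(\d+)/(\w+)\(\)\]")
ERRNO = re.compile(r"errno=(\d+)")


def summarize(log_text):
    """Count entries per (level, function) and occurrences of each errno."""
    entries = Counter((m.group(1), m.group(4)) for m in HEADER.finditer(log_text))
    errnos = Counter(int(m.group(1)) for m in ERRNO.finditer(log_text))
    return entries, errnos


def describe_errno(code):
    """Return a readable name for a Linux errno value, if it is in the table."""
    return LINUX_ERRNO.get(code, f"errno {code} (not in this table)")
```

Applied to the excerpt above, this should count four `call_bail` entries, four `tcp_disconnect`, four `client_protocol_xfer`, and three `full_rw`, with errno 104 appearing twice and errno 115 once.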
I'm going to set up the debug xlator tomorrow if no one has anything off the
top of their head about what might be wrong.
We haven't tested heavy read load yet, just writes.
We have managed to cause it multiple times, but haven't pinned down a cause,
as the debug logging all spits out basically the same material.
The client also has fairly high CPU usage during the test, roughly 90% of the
core it's on.