Hi Teo,

"option transport-timeout 20" is too low; our default itself is 120.
Can you increase it, maybe to around 600?

-bulde

On 6/25/07, Constantin Teodorescu <teo@xxxxxxx> wrote:
Anand Avati wrote:
> Teo,
> If you are using glusterfs--mainline--2.4, please add to the write-behind
> 'option flush-behind off' and if needed 'option transport-timeout <secs>' to
> protocol/client volumes (where <secs> is sufficiently large). The
> option flush-behind should most likely fix your error.
>
> Instead you could just tla update to the latest patch (patch-184) and the
> errors should disappear.

Added 'option flush-behind off' and 'option transport-timeout 20'.
This time it didn't crash right from the beginning ... 2 updates and 1 vacuum were OK.

glu=# update animal set observatii='ok1';
UPDATE 713268
glu=# update animal set observatii='ok2';
UPDATE 713268
glu=# vacuum;
VACUUM
glu=# update animal set observatii='ok3';
ERROR: could not read block 206 of relation 534643271/534643272/534643273: File descriptor in bad state

Thank you for your patience!
Teo

LOGS

Client

[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643273,
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643276,
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643278,
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643281,
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643285,
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643287,
[Jun 24 23:20:30] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 56359 bytes r/w instead of 131158 (errno=104)
[Jun 24 23:20:30] [DEBUG/protocol.c:331/gf_block_unserialize_transport()] libglusterfs/protocol:gf_block_unserialize_transport: full_read of block failed
[Jun 24 23:20:30] [DEBUG/client-protocol.c:2609/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x86ca020
[Jun 24 23:20:30] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:client1: connection to server disconnected

--------------------------------------------------

Server (I noticed that server2 and server3 didn't show any CRITICAL errors in their logs; only server1 had problems)

[Jun 24 23:23:28] [ERROR/common-utils.c:110/full_rwv()] libglusterfs:full_rwv: 98464 bytes r/w instead of 131281 (Connection reset by peer)
[Jun 24 23:23:28] [ERROR/proto-srv.c:117/generic_reply()] protocol/server:transport_writev failed
[Jun 24 23:23:28] [ERROR/tcp.c:110/tcp_except()] transport/tcp:shutdown () - error: Transport endpoint is not connected
[Jun 24 23:23:28] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 129983 bytes r/w instead of 131299 (errno=107)
[Jun 24 23:23:28] [DEBUG/protocol.c:331/gf_block_unserialize_transport()] libglusterfs/protocol:gf_block_unserialize_transport: full_read of block failed
[Jun 24 23:23:28] [DEBUG/proto-srv.c:2826/open_file_cleanup_fn()] protocol/server:force releaseing file 0x8051d30
[Jun 24 23:23:28] [DEBUG/proto-srv.c:2826/open_file_cleanup_fn()] protocol/server:force releaseing file 0x8051c90
[Jun 24 23:23:28] [DEBUG/proto-srv.c:2867/proto_srv_cleanup()] protocol/server:cleaned up xl_private of 0x804b1d0
[Jun 24 23:23:28] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:server: connection to server disconnected
[Jun 24 23:23:28] [DEBUG/tcp-server.c:229/gf_transport_fini()] tcp/server:destroying transport object for 29.72.76.22:1021 (fd=5)

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel
--
Amar Tumballi
http://amar.80x25.org
[bulde on #gluster/irc.gnu.org]
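
For anyone triaging client and server logs like the ones quoted in this thread, here is a minimal Python sketch that splits a log line into its fields and names the errno codes that show up (104 and 107). It is only a sketch under assumptions: the line layout is inferred from the quoted lines alone, not from the GlusterFS source; the errno names are the Linux values; and `parse_line` and `explain` are hypothetical helper names, not part of GlusterFS.

```python
import re

# Layout inferred only from the log lines quoted in this thread:
# [timestamp] [LEVEL/file.c:line/function()] message
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] "
    r"\[(?P<level>[A-Z]+)/(?P<file>[^:/]+):(?P<line>\d+)/(?P<func>[^\]]+)\] "
    r"(?P<msg>.*)$"
)
ERRNO_RE = re.compile(r"\(errno=(\d+)\)")

# Assumption: Linux hosts. Only the two codes seen in the thread are listed.
LINUX_ERRNO = {
    104: "ECONNRESET: Connection reset by peer",
    107: "ENOTCONN: Transport endpoint is not connected",
}


def parse_line(text):
    """Return a dict of fields for one log line, or None if it does not match."""
    m = LINE_RE.match(text.strip())
    if m is None:
        return None
    rec = m.groupdict()
    rec["line"] = int(rec["line"])
    e = ERRNO_RE.search(rec["msg"])
    rec["errno"] = int(e.group(1)) if e else None
    return rec


def explain(rec):
    """One-line summary, with the errno name appended when it is a known one."""
    note = LINUX_ERRNO.get(rec["errno"], "")
    suffix = f" [{note}]" if note else ""
    return f'{rec["ts"]} {rec["level"]} {rec["func"]}: {rec["msg"]}{suffix}'


if __name__ == "__main__":
    sample = (
        "[Jun 24 23:20:30] [ERROR/common-utils.c:55/full_rw()] "
        "libglusterfs:full_rw: 56359 bytes r/w instead of 131158 (errno=104)"
    )
    print(explain(parse_line(sample)))
```

Filtering the parsed records on `level` (ERROR, CRITICAL) and grouping them by timestamp should make it easier to line up the client-side disconnect with what server1 logged around the same time.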