Re: Buggy writebehind translators !!!

"Amar S. Tumballi" <amar@xxxxxxxxxxxxx> · Mon, 25 Jun 2007 02:42:55 +0530

Hi Teo,
"option transport-timeout 20" is too low; our default is already 120.
Can you increase it, maybe to around 600?
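
For illustration, the change in the client spec file might look roughly like the sketch below. The volume name client1 is taken from the client log; the remote-host and remote-subvolume values, and the write-behind volume's name and subvolume, are placeholders assumed here, so keep whatever the existing spec file already has in those places.

```
# Sketch only. remote-host, remote-subvolume, and the write-behind
# volume name and subvolume are assumed placeholders.
volume client1
  type protocol/client
  option transport-type tcp/client
  # placeholder server address and export name
  option remote-host 192.168.0.1
  option remote-subvolume brick
  # raised from 20; the default is 120
  option transport-timeout 600
end-volume

volume writebehind
  type performance/write-behind
  # the setting suggested earlier in this thread
  option flush-behind off
  # placeholder: whichever volume sits directly below write-behind
  subvolumes afr
end-volume
```

The same transport-timeout line would go into each protocol/client volume, not only client1.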

-bulde

On 6/25/07, Constantin Teodorescu <teo@xxxxxxx> wrote:

Anand Avati wrote:
> Teo,
> If you are using glusterfs--mainline--2.4, please add
> 'option flush-behind off' to the write-behind volume and, if needed,
> 'option transport-timeout <secs>' to the protocol/client volumes
> (where <secs> is sufficiently large). The option flush-behind should
> most likely fix your error.
>
> Alternatively, you could just tla update to the latest patch
> (patch-184) and the errors should disappear.

Added 'option flush-behind off' and 'option transport-timeout 20'.
This time it didn't crash right from the beginning: 2 updates and 1
vacuum were OK.

glu=# update animal set observatii='ok1';
UPDATE 713268

glu=# update animal set observatii='ok2';
UPDATE 713268

glu=# vacuum;
VACUUM

glu=# update animal set observatii='ok3';
ERROR:  could not read block 206 of relation 534643271/534643272/534643273: File descriptor in bad state

Thank you for your patience!
Teo

LOGS
Client
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643273,
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643276,
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643278,
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643281,
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643285,
[Jun 24 23:19:42] [DEBUG/afr.c:65/afr_get_num_copies()] afr:matched! pattern = *, filename = 534643287,
[Jun 24 23:20:30] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 56359 bytes r/w instead of 131158 (errno=104)
[Jun 24 23:20:30] [DEBUG/protocol.c:331/gf_block_unserialize_transport()] libglusterfs/protocol:gf_block_unserialize_transport: full_read of block failed
[Jun 24 23:20:30] [DEBUG/client-protocol.c:2609/client_protocol_cleanup()] protocol/client:cleaning up state in transport object 0x86ca020
[Jun 24 23:20:30] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:client1: connection to server disconnected
--------------------------------------------------
Server (I noticed that server2 and server3 didn't show any CRITICAL
errors in their logs; only server1 had problems)

[Jun 24 23:23:28] [ERROR/common-utils.c:110/full_rwv()] libglusterfs:full_rwv: 98464 bytes r/w instead of 131281 (Connection reset by peer)
[Jun 24 23:23:28] [ERROR/proto-srv.c:117/generic_reply()] protocol/server:transport_writev failed
[Jun 24 23:23:28] [ERROR/tcp.c:110/tcp_except()] transport/tcp:shutdown () - error: Transport endpoint is not connected
[Jun 24 23:23:28] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 129983 bytes r/w instead of 131299 (errno=107)
[Jun 24 23:23:28] [DEBUG/protocol.c:331/gf_block_unserialize_transport()] libglusterfs/protocol:gf_block_unserialize_transport: full_read of block failed
[Jun 24 23:23:28] [DEBUG/proto-srv.c:2826/open_file_cleanup_fn()] protocol/server:force releaseing file 0x8051d30
[Jun 24 23:23:28] [DEBUG/proto-srv.c:2826/open_file_cleanup_fn()] protocol/server:force releaseing file 0x8051c90
[Jun 24 23:23:28] [DEBUG/proto-srv.c:2867/proto_srv_cleanup()] protocol/server:cleaned up xl_private of 0x804b1d0
[Jun 24 23:23:28] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:server: connection to server disconnected
[Jun 24 23:23:28] [DEBUG/tcp-server.c:229/gf_transport_fini()] tcp/server:destroying transport object for 29.72.76.22:1021 (fd=5)

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel

--
Amar Tumballi
http://amar.80x25.org
[bulde on #gluster/irc.gnu.org]