RE: Problems with unify

Antonio González <antonio.gonzalez@xxxxxxxxxx> · Tue, 22 Apr 2008 18:17:18 +0200

Hello Krishna,

I tested the command "echo 0 > /proc/sys/net/ipv4/tcp_retries2" on the server
and the result is the same. If I try an "ls" from a client, it takes
approximately 6 minutes and returns "Transport endpoint is not connected".
Other clients cannot connect to the GlusterFS file system. The server stays
blocked the whole time, and it is necessary to reboot the server.

Thanks,  

-----Original Message-----
From: krishna.zresearch@xxxxxxxxx [mailto:krishna.zresearch@xxxxxxxxx] On
behalf of Krishna Srinivas
Sent: Tuesday, April 22, 2008 14:29
To: Antonio González
CC: Gluster Devel
Subject: Re: Problems with unify

Hi Antonio,

I have not tried to reproduce the problem, but I can guess what might be
happening. Can you do "echo 0 > /proc/sys/net/ipv4/tcp_retries2" on server
and check how much time your server and clients take to come back to
life again? It might take several minutes.
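
To measure how long a client stays blocked, a small timing wrapper can help.
What follows is only a sketch and not part of GlusterFS: the helper name
time_command and the default mount point /mnt/glusterfs are assumptions made
for illustration.

```python
import subprocess
import sys
import time


def time_command(cmd, timeout_s=900):
    """Run cmd and return (elapsed_seconds, outcome).

    outcome is "exit <code>" if the command finished on its own,
    or "timeout" if it was still running after timeout_s seconds.
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
        outcome = "exit %d" % proc.returncode
    except subprocess.TimeoutExpired:
        outcome = "timeout"
    return time.monotonic() - start, outcome


if __name__ == "__main__":
    # Hypothetical default; pass the real mount point as the first argument.
    target = sys.argv[1] if len(sys.argv) > 1 else "/mnt/glusterfs"
    elapsed, outcome = time_command(["ls", target])
    print("ls %s: %s after %.1f s" % (target, outcome, elapsed))
```

Run it on a blocked client once before and once after changing tcp_retries2
and compare the two elapsed times. Note that if "ls" is stuck in
uninterruptible I/O on the mount, the timeout may not be able to kill it, so
the script itself can stay blocked as long as the client does.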

This is definitely a bug and we will fix it.

Thanks
Krishna

On Tue, Apr 22, 2008 at 5:55 PM, Antonio González
<antonio.gonzalez@xxxxxxxxxx> wrote:
>
> Hello Krishna,
>
> I have run more tests to try to clarify the problem. I hope this
> information helps you.
>
> I tried a simple setup: one server that exports one brick (only posix
> storage and tcp/server) and three clients (only tcp/client).
>
> The test is:
>
> * From client1 I run "cp /home/element1 /mnt/gluster".
> * While client1 is running the cp, I unplug its power cable.
> * From client2 I run an "ls" command. Client2 is blocked.
> * If I try any operation from client3, it is blocked as well.
> * Client2 stays blocked for 3 to 4 minutes, then it shows the message
>   "ls: cannot open directory .: Transport endpoint is not connected".
> * The logs of the blocked clients say nothing.
> * If I try to connect another client to the GlusterFS file system, the
>   connection is not possible and the client log says:
>   [client-protocol.c:279:client_protocol_xfer] trans: attempting to pipeline
>   request type(2) op(4) with handshake.
>
> Krishna, I need to know whether you have been able to reproduce the bug. I
> have to present the GlusterFS project, and I need to know at least whether
> this issue is a new bug and whether the developers will work on fixing it.
>
> -----Original Message-----
> From: gluster-devel-bounces+antonio.gonzalez=libera.net@xxxxxxxxxx
> [mailto:gluster-devel-bounces+antonio.gonzalez=libera.net@xxxxxxxxxx] On
> behalf of Antonio González
> Sent: Monday, April 21, 2008 20:01
> To: 'Gluster Devel'
> Subject: Problems with unify
>
> Hello Krishna,
>
> Can you reproduce the bug? I have run more tests on this issue; here are my
> observations:
>
> * The bug happens when the client that goes down is performing a write
>   operation on GlusterFS (cp local glusterfs) and another client runs an
>   "ls" command or a
>   "find /mnt/glusterfs -type f -print0 | xargs -0 head -c1 >/dev/null"
>   command. I have not been able to reproduce the bug with other commands.
>
> * I have not been able to reproduce the bug with a read operation on the
>   client that goes down, only with write operations (cp local glusterfs).
>
> * Sometimes when I try to reproduce the bug (with a write operation) it
>   does not happen; I don't know why. Most of the time the bug is
>   reproduced.
>
> * I tested a setup without the unify translator, using the stripe
>   translator in its place; I am able to reproduce the same bug with this
>   configuration too.
>
> I hope that these points help you to clarify the problem.
>
> Thanks.
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxx
> http://lists.nongnu.org/mailman/listinfo/gluster-devel