-------- Original Message --------
> Date: Fri, 05 Jun 2009 22:54:49 +0200
> From: "Steve" <steeeeeveee@xxxxxxx>
> To: gluster-devel@xxxxxxxxxx
> Subject: very strange issue with 2.0.1

> I have a very strange issue with 2.0.1. I have 2 systems. On each system
> both the server AND the client are running. The 2 servers use server-side
> afr/replicate, and the client on each system is connected to a single
> brick/volume exported by its local server. The client does not know
> anything about the other server.
>
> Now when I benchmark and look at what goes over the wire, I cannot get
> more than +/- 14MB/s, no matter which performance translators I
> enable/disable. The speed is always around 14MB/s.
>
> Now if I put NUFA on top of replicate, things change. I get faster
> transfers, but the file written does not get transferred to the other
> server (I see NO network traffic in that regard). But if I go to the other
> server and do a simple "ls", then network traffic goes up to +/- 50MB/s
> and the file shows up on the other server/client.
>
> That sounds normal to me (well... that's probably NUFA favoring the
> local disk).
>
> However... after the "ls" command on the server/client the file was
> transferred to, the other server/client crashes with the following log:
> +------------------------------------------------------------------------------+
> [2009-06-05 22:31:48] N [afr.c:2190:notify] gfs-srv-ds-replicate: Subvolume 'gfs-srv-ds-locks' came back up; going online.
> [2009-06-05 22:31:48] N [afr.c:2190:notify] gfs-srv-ds-replicate: Subvolume 'gfs-srv-ds-locks' came back up; going online.
> [2009-06-05 22:31:48] N [afr.c:2190:notify] gfs-srv-ds-replicate: Subvolume 'gfs-srv-ds-locks' came back up; going online.
> [2009-06-05 22:31:48] N [afr.c:2190:notify] gfs-srv-ds-replicate: Subvolume 'gfs-srv-ds-locks' came back up; going online.
> [2009-06-05 22:31:48] N [glusterfsd.c:1152:main] glusterfs: Successfully started
> [2009-06-05 22:31:48] N [client-protocol.c:5557:client_setvolume_cbk] gfs-srv-ds-remote: Connected to 192.168.0.77:6997, attached to remote volume 'gfs-srv-ds-locks'.
> [2009-06-05 22:31:48] N [client-protocol.c:5557:client_setvolume_cbk] gfs-srv-ds-remote: Connected to 192.168.0.77:6997, attached to remote volume 'gfs-srv-ds-locks'.
> [2009-06-05 22:31:51] N [server-protocol.c:7035:mop_setvolume] gfs-srv-ds-server: accepted client from 192.168.0.77:1021
> [2009-06-05 22:31:51] N [server-protocol.c:7035:mop_setvolume] gfs-srv-ds-server: accepted client from 127.0.0.1:1023
> [2009-06-05 22:31:51] N [server-protocol.c:7035:mop_setvolume] gfs-srv-ds-server: accepted client from 127.0.0.1:1022
> [2009-06-05 22:31:51] N [server-protocol.c:7035:mop_setvolume] gfs-srv-ds-server: accepted client from 192.168.0.77:1020
> pending frames:
> frame : type(1) op(LOOKUP)
>
> patchset: 5c1d9108c1529a1155963cb1911f8870a674ab5b
> signal received: 11
> configuration details:argp 1
> backtrace 1
> dlfcn 1
> fdatasync 1
> libpthread 1
> llistxattr 1
> setfsid 1
> spinlock 1
> xattr.h 1
> st_atim.tv_nsec 1
> package-string: glusterfs 2.0.1
> [0xffffe400]
> /usr/lib/glusterfs/2.0.1/xlator/protocol/client.so(client_lookup+0x96)[0xb75573fb]
> /usr/lib/glusterfs/2.0.1/xlator/cluster/replicate.so(afr_lookup+0x22f)[0xb7517a05]
> /usr/lib/glusterfs/2.0.1/xlator/cluster/nufa.so(nufa_lookup+0x3ea)[0xb7503fa9]
> /usr/lib/glusterfs/2.0.1/xlator/performance/io-threads.so(iot_lookup_wrapper+0xa5)[0xb74e8415]
> /usr/lib/libglusterfs.so.0(call_resume+0x344)[0xb7f47e01]
> /usr/lib/glusterfs/2.0.1/xlator/performance/io-threads.so(iot_worker_unordered+0x20)[0xb74e5895]
> /lib/libpthread.so.0[0xb7f154cf]
> /lib/libc.so.6(clone+0x5e)[0xb7e9b27e]
> ---------
>
>
> Now my questions:
>
> 1) Is this issue known? I can reproduce that error and therefore I could
> send more info if needed.
>
> 2) Why is Server 1 AFR/Replicate <-> Server 2 AFR/Replicate so slow? Just
> 14MB/s on GigE seems slow to me. Writing directly to the local disk
> (without GlusterFS) delivers +/- 57MB/s. Going over NFSv4 delivers
> +/- 45MB/s. Going over SSH delivers +/- 33MB/s. Transfer from Server 1 to
> Server 2 with the GlusterFS log delivers +/- 45MB/s. Just pure server to
> server with AFR/Replicate only delivers 14MB/s. Why?
>

Ach! I did more testing and the conclusion is: writing in 128KB chunks changes the speed. Local disk write is then almost 90MB/s and GlusterFS is around 45MB/s. I guess I have no real speed issue. I would love to come close to 90MB/s, but 45MB/s is fine.

However... the crash is still there when using NUFA. Maybe I just messed up and tried too many different (and obscure) combinations, because NUFA never failed on me in the past?

> // Steve

// Steve
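
For anyone who wants to repeat the chunk-size comparison mentioned above, here is a minimal sketch in Python. It is not the tool that produced the numbers in this message (the message does not say which one was used). The function name `write_throughput`, the file name, the 16 MB test size, and the 4KB baseline are all assumptions made for illustration; only the 128KB chunk size comes from the message.

```python
import os
import tempfile
import time


def write_throughput(path, total_bytes, chunk_size):
    """Write total_bytes to path in chunk_size pieces and return MB/s.

    The file is opened unbuffered and fsync'ed before the clock stops,
    so each write() call hands one chunk to the filesystem.
    """
    chunk = b"\0" * chunk_size
    written = 0
    start = time.perf_counter()
    with open(path, "wb", buffering=0) as f:
        while written < total_bytes:
            n = min(chunk_size, total_bytes - written)
            # Raw write() returns the number of bytes actually written.
            written += f.write(chunk[:n])
        os.fsync(f.fileno())
    elapsed = time.perf_counter() - start
    return written / (1024 * 1024) / elapsed


if __name__ == "__main__":
    # Assumption: replace this with the GlusterFS mount point to test it.
    target_dir = tempfile.gettempdir()
    path = os.path.join(target_dir, "chunk_test.bin")
    # 4 KB is a hypothetical small-chunk baseline; 128 KB is from the message.
    for chunk_size in (4 * 1024, 128 * 1024):
        rate = write_throughput(path, 16 * 1024 * 1024, chunk_size)
        os.remove(path)
        print(f"{chunk_size // 1024:>4} KB chunks: {rate:.1f} MB/s")
```

Running it once against the local disk and once against the mounted volume gives a rough side-by-side figure. Caching and the small test size will skew absolute numbers, so treat the results as relative only.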