After copying a few thousand files, then deleting and copying again, I get a lot of errors:

File descriptor in bad state
No such file or directory

and a lot of these in the glusterd.log:

[Jun 26 05:45:13] [CRITICAL/tcp.c:81/tcp_disconnect()] transport/tcp:server: connection to server disconnected
[Jun 26 05:45:13] [ERROR/common-utils.c:55/full_rw()] libglusterfs:full_rw: 0 bytes r/w instead of 113 (errno=9)

I have set it up like this: 1.3-pre4, 5 servers + 5 clients (running on the same boxes as the servers). What could cause the disconnections?

server:

volume gfs
  type storage/posix
  option directory /mnt/gluster/gfs1
end-volume

volume gfs-afr
  type storage/posix
  option directory /mnt/gluster/afr-gfs1
end-volume

volume server
  type protocol/server
  option transport-type tcp/server
  option listen-port 6996
  subvolumes gfs gfs-afr
  option auth.ip.gfs.allow *
  option auth.ip.gfs-afr.allow *
end-volume

client:

[root@hd-t1157cl etc]# cat cluster-client.vol

volume a1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.47.0.10
  option remote-port 6996
  option remote-subvolume gfs
end-volume

volume a2
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.47.0.10
  option remote-port 6996
  option remote-subvolume gfs-afr
end-volume

volume b1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.47.0.11
  option remote-port 6996
  option remote-subvolume gfs
end-volume

volume b2
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.47.0.11
  option remote-port 6996
  option remote-subvolume gfs-afr
end-volume

volume c1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.47.0.12
  option remote-port 6996
  option remote-subvolume gfs
end-volume

volume c2
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.47.0.12
  option remote-port 6996
  option remote-subvolume gfs-afr
end-volume

volume d1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.47.0.13
  option remote-port 6996
  option remote-subvolume gfs
end-volume

volume d2
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.47.0.13
  option remote-port 6996
  option remote-subvolume gfs-afr
end-volume

volume e1
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.47.0.14
  option remote-port 6996
  option remote-subvolume gfs
end-volume

volume e2
  type protocol/client
  option transport-type tcp/client
  option remote-host 10.47.0.14
  option remote-port 6996
  option remote-subvolume gfs-afr
end-volume

volume afr1
  type cluster/afr
  subvolumes a1 e2
  option replicate *:2
end-volume

volume afr2
  type cluster/afr
  subvolumes b1 d2
  option replicate *:2
end-volume

volume afr3
  type cluster/afr
  subvolumes c1 a2
  option replicate *:2
end-volume

volume afr4
  type cluster/afr
  subvolumes d1 b2
  option replicate *:2
end-volume

volume afr5
  type cluster/afr
  subvolumes e1 c2
  option replicate *:2
end-volume

volume gfstest
  type cluster/unify
  subvolumes afr1 afr2 afr3 afr4 afr5
  option scheduler rr
  option rr.limits.min-free-disk 5GB
end-volume
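Side note: the client spec is completely regular, so it can be generated from the host list instead of being edited by hand. The sketch below (plain Python, just an illustration; the hosts and the afr pairing are copied from the spec above, everything else is a placeholder and it is not a glusterfs tool) shows one way that could look:

#!/usr/bin/env python
# Sketch only: regenerate a cluster-client.vol with the same layout as above.
# The host list and the AFR pairing are copied from the posted spec; anything
# else is illustrative.

hosts = [("a", "10.47.0.10"), ("b", "10.47.0.11"), ("c", "10.47.0.12"),
         ("d", "10.47.0.13"), ("e", "10.47.0.14")]

# Pairing as posted: each server's "gfs" export is mirrored on another
# server's "gfs-afr" export.
afr_pairs = [("a1", "e2"), ("b1", "d2"), ("c1", "a2"),
             ("d1", "b2"), ("e1", "c2")]

out = []

# One protocol/client volume per export: <letter>1 -> gfs, <letter>2 -> gfs-afr.
for letter, host in hosts:
    for suffix, remote in ((1, "gfs"), (2, "gfs-afr")):
        out.append("volume %s%d\n"
                   "  type protocol/client\n"
                   "  option transport-type tcp/client\n"
                   "  option remote-host %s\n"
                   "  option remote-port 6996\n"
                   "  option remote-subvolume %s\n"
                   "end-volume\n" % (letter, suffix, host, remote))

# One cluster/afr volume per pair.
for n, (first, second) in enumerate(afr_pairs, 1):
    out.append("volume afr%d\n"
               "  type cluster/afr\n"
               "  subvolumes %s %s\n"
               "  option replicate *:2\n"
               "end-volume\n" % (n, first, second))

# cluster/unify over all the afr volumes, round-robin scheduler.
afr_names = " ".join("afr%d" % n for n in range(1, len(afr_pairs) + 1))
out.append("volume gfstest\n"
           "  type cluster/unify\n"
           "  subvolumes %s\n"
           "  option scheduler rr\n"
           "  option rr.limits.min-free-disk 5GB\n"
           "end-volume\n" % afr_names)

print("\n".join(out))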
On 6/26/07, Sebastien LELIEVRE <slelievre@xxxxxxxxxxxxxxxx> wrote:

Hi again !

Shai DB wrote:
> another question
> I noticed that 1.2 doesn't have AFR in its source
> how can I use/install it anyway ?
> I saw 1.3-pre has it..
> is 1.3-pre OK for production ?
> thanks

I had forgotten this point ! :)

Yes, the 1.3-pre4 archive is stable enough for production, but you can also
use the tla repository with the 2.4 branch, which is stable enough (to me) to
be used in production.

Just note that the 1.3 stable release will be based on the 2.5 main branch
and will include the self-heal feature (and many more !)

Cheers,

Sebastien LELIEVRE
slelievre@xxxxxxxxxxxxxxxx    Services to ISP
TBS-internet                  http://www.TBS-internet.com

> I need it for replication (to have 2 copies of data in case of crash)
>
> On 6/26/07, Sebastien LELIEVRE <slelievre@xxxxxxxxxxxxxxxx> wrote:
>
>    Hi,
>
>    I just wanted to stress this :
>
>    Shai wrote:
>    > Hello, we are testing glusterfs 1.2 and I have a few questions -
>
>    1.2 doesn't bring "self-heal" with it, so keep in mind that if a drive
>    crashes, you would have to sync your new drive "manually" with the
>    others.
>
> so just copy all the data to the replaced disk from its afr 'pair' ?
>
>    BUT, 1.3 is going to correct this, and this is good :)
>
>    That's all I had to add
>
>    Cheers,
>
>    Sebastien LELIEVRE
>    slelievre@xxxxxxxxxxxxxxxx    Services to ISP
>    TBS-internet                  http://www.TBS-internet.com
>
>    Krishna Srinivas wrote:
>    > As of now you need to restart glusterfs if there is any change
>    > in the config spec file. However, in future versions you won't need
>    > to remount (this is in our road map).
>    >
>    > On 6/25/07, Shai DB <dbshai@xxxxxxxxx> wrote:
>    >> thanks for the answer
>    >> this seems easy and neat to set up
>    >>
>    >> another question is, if I add 2 more nodes to the gang,
>    >> how can I set up all the clients with the new configuration
>    >> without needing to 'remount' the glusterfs ?
>    >>
>    >> Thanks
>    >>
>    >> On 6/25/07, Krishna Srinivas <krishna@xxxxxxxxxxxxx> wrote:
>    >> >
>    >> > On 6/25/07, Shai DB <dbshai@xxxxxxxxx> wrote:
>    >> > > Hello, we are testing glusterfs 1.2 and I have a few questions -
>    >> > >
>    >> > > 1. we are going to store millions of small jpg files that will
>    >> > > be read by a webserver - is glusterfs a good solution for this ?
>    >> >
>    >> > Yes, definitely.
>    >> >
>    >> > > 2. we are going to run both server+clients on each node,
>    >> > > together with apache
>    >> > >
>    >> > > 3. replicate *:2
>    >> > >
>    >> > > the way I think of doing the replication is defining 2 volumes
>    >> > > on each server and using AFR:
>    >> > >
>    >> > > server1: a1, a2
>    >> > > server2: b1, b2
>    >> > > server3: c1, c2
>    >> > > server4: d1, d2
>    >> > > server5: e1, e2
>    >> > >
>    >> > > afr1: a1+b2
>    >> > > afr2: b1+c2
>    >> > > afr3: c1+d2
>    >> > > afr4: d1+e2
>    >> > > afr5: e1+a2
>    >> > >
>    >> > > and then unify = afr1+afr2+afr3+afr4+afr5 with the replicate
>    >> > > option
>    >> > >
>    >> > > is this the correct way ?
>    >> > > and what to do in the future when we add more nodes ? when
>    >> > > changing the afr (adding and changing the couples), will
>    >> > > glusterfs redistribute the files the new way ?
>    >> >
>    >> > Yes, this is the right way.
>    >> > If you add one more server f, one solution is to move the
>    >> > contents of a2 to f2, clean up a2, and have it as follows:
>    >> >
>    >> > afr5: e1 + f2
>    >> > afr6: f1 + a2
>    >> >
>    >> > Can't think of an easier solution.
>    >> >
>    >> > But if we assume that you will always add 2 servers when you
>    >> > want to add, we can have the setup in the following way:
>    >> > afr1: a1 + b2
>    >> > afr2: b1 + a2
>    >> > afr3: c1 + d2
>    >> > afr4: d1 + c2
>    >> > afr5: e1 + f2
>    >> > afr6: f1 + e2
>    >> >
>    >> > Now when you add a pair of servers to this (g, h):
>    >> > afr7: g1 + h2
>    >> > afr8: h1 + g2
>    >> >
>    >> > Which is very easy. But you will have to add 2 servers every
>    >> > time. The advantage is that it is easier to visualize the setup
>    >> > and add new nodes.
>    >> >
>    >> > Thinking further, if we assume that you will replicate all the
>    >> > files twice (option replicate *:2), you can have the following
>    >> > setup:
>    >> > afr1: a + b
>    >> > afr2: c + d
>    >> > afr3: e + f
>    >> >
>    >> > This is a very easy setup. It is simple to add a fresh pair
>    >> > (afr4: g + h).
>    >> >
>    >> > You can have whatever setup you want depending on your
>    >> > convenience and requirements.
>    >> >
>    >> > > 4. when a hard drive goes down and is replaced, does the
>    >> > > cluster also redistribute the files ?
>    >> >
>    >> > When a hard drive is replaced, missing files will be replicated
>    >> > from the AFR's other child.
>    >> >
>    >> > Regards
>    >> > Krishna
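Just to make sure I follow the "always add 2 servers" layout Krishna describes above, here is a quick sketch (Python, purely for illustration; the server names are placeholders and this is not part of glusterfs) that prints the afr pairs for servers added two at a time:

# Sketch of the "add two servers at a time" AFR layout described above:
# each new pair (x, y) contributes the volumes x1+y2 and y1+x2.

def afr_pairs(servers):
    """servers: server names in the order they were added, two at a time."""
    if len(servers) % 2 != 0:
        raise ValueError("this scheme assumes servers are added in pairs")
    pairs = []
    for i in range(0, len(servers), 2):
        x, y = servers[i], servers[i + 1]
        pairs.append((x + "1", y + "2"))   # e.g. afr1: a1 + b2
        pairs.append((y + "1", x + "2"))   # e.g. afr2: b1 + a2
    return pairs

for n, (first, second) in enumerate(afr_pairs(list("abcdefgh")), 1):
    print("afr%d: %s + %s" % (n, first, second))

# Prints afr1: a1 + b2 through afr8: h1 + g2.  Adding another pair (i, j)
# later only appends afr9: i1 + j2 and afr10: j1 + i2; the existing pairs
# and the files already stored on them are untouched.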