Re: non-blocking connect() returned: 111 (Connection refused)

Jordi Moles Blanco <jordi@xxxxxxxxx> · Wed, 17 Dec 2008 12:52:34 +0100

Jordi Moles Blanco wrote:
Hi,

I've got 6 nodes providing a storage unit with gluster 2.5 patch 800.
They are set up in 2 groups of 3 nodes each.

On top of that, I've got a Xen 3.2 machine storing its virtual
machines on the gluster mount point.

The thing is that I used to have only 2 nodes per group, that's 4
nodes in total, and today I'm trying to add 1 extra node to each group.

This is the final configuration on the Xen side:

**************

volume espai1
       type protocol/client
       option transport-type tcp/client
       option remote-host 10.0.0.3
       option remote-subvolume espai
end-volume

volume espai2
       type protocol/client
       option transport-type tcp/client
       option remote-host 10.0.0.4
       option remote-subvolume espai
end-volume

volume espai3
       type protocol/client
       option transport-type tcp/client
       option remote-host 10.0.0.5
       option remote-subvolume espai
end-volume

volume espai4
       type protocol/client
       option transport-type tcp/client
       option remote-host 10.0.0.6
       option remote-subvolume espai
end-volume

volume espai5
       type protocol/client
       option transport-type tcp/client
       option remote-host 10.0.0.7
       option remote-subvolume espai
end-volume

volume espai6
       type protocol/client
       option transport-type tcp/client
       option remote-host 10.0.0.8
       option remote-subvolume espai
end-volume

volume namespace1
       type protocol/client
       option transport-type tcp/client
       option remote-host 10.0.0.4
       option remote-subvolume nm
end-volume

volume namespace2
       type protocol/client
       option transport-type tcp/client
       option remote-host 10.0.0.5
       option remote-subvolume nm
end-volume

volume grup1
       type cluster/afr
       subvolumes espai1 espai3 espai5
end-volume

volume grup2
       type cluster/afr
       subvolumes espai2 espai4 espai6
end-volume

volume nm
       type cluster/afr
       subvolumes namespace1 namespace2
end-volume

volume g01
       type cluster/unify
       subvolumes grup1 grup2
       option scheduler rr
       option namespace nm
end-volume

volume io-cache
       type performance/io-cache
       option cache-size 512MB
       option page-size 1MB
       option force-revalidate-timeout 2
       subvolumes g01
end-volume 

**************
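
With this many volumes it is easy to lose track of which servers end up
under which group. What follows is only a minimal sketch, not anything
shipped with gluster: a small Python parser for the "volume ... end-volume"
syntax shown above. The function names (parse_volfile, hosts_under) and the
shortened SAMPLE spec are hypothetical and exist only for illustration.

```python
# Hypothetical helper, not part of gluster: parse a client spec file and
# list the remote hosts that sit below a given volume.

SAMPLE = """
volume espai1
       type protocol/client
       option remote-host 10.0.0.3
end-volume

volume espai3
       type protocol/client
       option remote-host 10.0.0.5
end-volume

volume espai5
       type protocol/client
       option remote-host 10.0.0.7
end-volume

volume grup1
       type cluster/afr
       subvolumes espai1 espai3 espai5
end-volume
"""


def parse_volfile(text):
    """Parse 'volume NAME ... end-volume' blocks into a dict keyed by name."""
    volumes = {}
    current = None
    for raw in text.splitlines():
        words = raw.split()
        if not words:
            continue  # blank line
        if words[0] == "volume" and len(words) == 2:
            current = {"type": None, "options": {}, "subvolumes": []}
            volumes[words[1]] = current
        elif words[0] == "end-volume":
            current = None
        elif current is not None and len(words) >= 2:
            if words[0] == "type":
                current["type"] = words[1]
            elif words[0] == "option":
                current["options"][words[1]] = " ".join(words[2:])
            elif words[0] == "subvolumes":
                current["subvolumes"] = words[1:]
    return volumes


def hosts_under(volumes, name):
    """Return the remote-host values found below the named volume."""
    vol = volumes[name]
    if vol["type"] == "protocol/client":
        return [vol["options"]["remote-host"]]
    hosts = []
    for sub in vol["subvolumes"]:
        hosts.extend(hosts_under(volumes, sub))
    return hosts
```

Calling hosts_under(parse_volfile(SAMPLE), "grup1") gives the three server
addresses behind grup1, which makes it easy to compare against the nodes
that are actually running.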

So... I stopped all the virtual machines, unmounted gluster on Xen,
updated the spec file (the one above) and started gluster again on Xen.

I've set up different gluster environments, but I had never tried this,
and now I'm facing some problems.

From what I had read before this... I thought that when adding
an extra node to a group and "remounting" on the client side, the
self-healing feature would copy all the content of the nodes already
present in the group to the "new one". That hasn't happened, even when
I've tried to force it by listing the files or doing
what you suggest in your documentation:

**********

find /mnt/glusterfs -type f -print0 | xargs -0 head -c1 >/dev/null

**********
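
The command above just reads the first byte of every regular file under
the mount point. A rough Python equivalent is sketched below, mainly
because it also reports which files could not be read. The function name
touch_all_files is made up for this example, and /mnt/glusterfs is the
same path used in the command above.

```python
import os


def touch_all_files(mount_point):
    """Read the first byte of every regular file below mount_point.

    Returns (files_read, errors); errors is a list of (path, message).
    """
    files_read = 0
    errors = []
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Mirror "find -type f": skip symlinks and non-regular files.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as handle:
                    handle.read(1)  # one byte is enough to open the file
                files_read += 1
            except OSError as exc:
                errors.append((path, str(exc)))
    return files_read, errors


if __name__ == "__main__":
    count, failed = touch_all_files("/mnt/glusterfs")
    print("read %d files, %d errors" % (count, len(failed)))
    for path, message in failed:
        print("  %s: %s" % (path, message))
```

Whether opening the files is enough to trigger a heal onto a newly added
node is exactly the open question here; the sketch only reproduces what the
find command does.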

So... my first question would be... does "self-healing" work this way?
If it doesn't... what is the best way to add a node to a group? Do I
have to run a "copy" command manually to get the new node ready?
I've also noticed that I necessarily have to unmount gluster on Xen.
Is there a way to avoid stopping all the virtual machines, unmounting
and mounting again? Is there a feature like "refresh config file"?

And finally... I looked into the logs to see why self-healing wasn't
working, and I found this on the Xen side:

**********
2008-12-17 12:08:30 E [tcp-client.c:190:tcp_connect] espai6: 
non-blocking connect() returned: 111 (Connection refused)
**********

And it keeps saying this when I try to access files which were
created on the "old nodes".

Is this a bug? How can I work around this?
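
Error 111 (Connection refused) normally means that nothing is accepting
connections on that address and port, so one way to narrow this down is to
try a plain TCP connect to each server from the Xen machine. The sketch
below is only that, a sketch: check_server is a hypothetical helper, and
the port number 6996 is my assumption about the default listen port for
this release, so replace it with whatever the server spec actually uses.

```python
import os
import socket


def check_server(host, port, timeout=3.0):
    """Try a TCP connect to host:port and return (ok, description)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.settimeout(timeout)
    try:
        # connect_ex returns an errno value instead of raising on failure.
        code = sock.connect_ex((host, port))
    except OSError as exc:
        # For example, a host name that cannot be resolved.
        return False, str(exc)
    finally:
        sock.close()
    if code == 0:
        return True, "connected"
    return False, "%d (%s)" % (code, os.strerror(code))


if __name__ == "__main__":
    GLUSTER_PORT = 6996  # assumption: adjust to the port your servers use
    for last_octet in range(3, 9):
        host = "10.0.0.%d" % last_octet
        ok, description = check_server(host, GLUSTER_PORT)
        print("%s:%d %s" % (host, GLUSTER_PORT, description))
```

If 10.0.0.8 (espai6) is refused here as well, the problem is probably on
the server side (daemon not running or listening on another port) rather
than in the client spec.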

If I create new stuff, though, it replicates to the 3 nodes, no
problem with that... the only problem is with the old files that were
already present before I added the new node.

Thanks in advance for your help, and let me know if you need any
further information.

_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxx
http://lists.nongnu.org/mailman/listinfo/gluster-devel

Well... I've found out something else...

In my previous mail I said that if I create new files, they replicate to
all the nodes of the group. I said this because if I do "ls" on the
"shared folder" on the new nodes, they "apparently" have the new content
as well.
However, if I shut down the 2 old nodes, Xen can't see anything at all
from gluster; the mount point looks "stale" and even the new content
is not available to Xen. If I look at the log files on the new nodes,
they log "activity" in the file system, but Xen can't get to the
data.

Thanks for your help.