Raghavendra G wrote:
Hi Jordi,
Have you started glusterfsd on each of the newly added nodes? If not,
please start them.
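On each of the new nodes that would be something along these lines (the spec file path is just an example; use wherever your server spec file actually lives):
**********
# start the server-side daemon with its spec file
glusterfsd -f /etc/glusterfs/glusterfsd.vol
**********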
Some comments have been inlined.
On Wed, Dec 17, 2008 at 3:28 PM, Jordi Moles Blanco <jordi@xxxxxxxxx> wrote:
Hi,
I've got 6 nodes providing a storage unit with gluster 2.5 patch
800. They are set up in 2 groups of 3 nodes each.
On top of that, I've got a Xen 3.2 machine storing its virtual
machines in the gluster mount point.
The thing is that I used to have only 2 nodes per group, that's 4
nodes in total, and today I'm trying to add 1 extra node to each
group.
This is the final config on Xen's side:
**************
volume espai1
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.3
option remote-subvolume espai
end-volume
volume espai2
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.4
option remote-subvolume espai
end-volume
volume espai3
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.5
option remote-subvolume espai
end-volume
volume espai4
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.6
option remote-subvolume espai
end-volume
volume espai5
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.7
option remote-subvolume espai
end-volume
volume espai6
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.8
option remote-subvolume espai
end-volume
volume namespace1
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.4
option remote-subvolume nm
end-volume
volume namespace2
type protocol/client
option transport-type tcp/client
option remote-host 10.0.0.5
option remote-subvolume nm
end-volume
volume grup1
type cluster/afr
subvolumes espai1 espai3 espai5
end-volume
volume grup2
type cluster/afr
subvolumes espai2 espai4 espai6
end-volume
volume nm
type cluster/afr
subvolumes namespace1 namespace2
end-volume
volume g01
type cluster/unify
subvolumes grup1 grup2
option scheduler rr
option namespace nm
end-volume
volume io-cache
type performance/io-cache
option cache-size 512MB
option page-size 1MB
option force-revalidate-timeout 2
subvolumes g01
end-volume
**************
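One inlined comment on the config: each of the two new nodes (10.0.0.7 and 10.0.0.8) also needs a server spec file exporting a volume named espai, since that is what remote-subvolume asks for above. A minimal sketch, assuming the backend directory is /export/espai and a permissive auth pattern:
**********
volume espai
type storage/posix
option directory /export/espai
end-volume

volume server
type protocol/server
option transport-type tcp/server
option auth.ip.espai.allow 10.0.0.*
subvolumes espai
end-volume
**********
If glusterfsd isn't running with such a file on those two hosts, the client will log exactly the connection-refused errors mentioned further down.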
So... I stopped all the virtual machines, unmounted gluster on Xen,
updated the spec file (the one above) and started gluster again on Xen.
I've set up different gluster environments before, but I had never
tried this, and now I'm facing some problems.
From what I had read, I used to think that when adding an extra
node to a group and "remounting" on the client's side, the
self-healing feature would copy all the content already present on
the other nodes of the group over to the new one. That hasn't
happened, even when I've tried to force it by listing the files or
doing what you suggest in your documentation:
**********
find /mnt/glusterfs -type f -print0 | xargs -0 head -c1 >/dev/null
**********
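One inlined note here: the one-byte read only touches regular files, so it can help to look up the directories first as well. A small sketch along the same lines:
**********
# look up every directory first so entries get created on the new node
find /mnt/glusterfs -type d -exec ls -ld {} \; > /dev/null
# then read one byte of every file to trigger data self-heal
find /mnt/glusterfs -type f -print0 | xargs -0 head -c1 > /dev/null
**********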
So... my first question would be: does "self-healing" work this
way? If it doesn't, what is the best way to add a node to a group?
Do I have to run a "copy" command manually to get the new node
ready?
I've also noticed that I necessarily have to unmount gluster from
Xen. Is there a way to avoid stopping all the virtual machines and
unmounting and mounting again? Is there a feature like "refresh
config file"?
Hot add ("refresh config file") is on the roadmap.
And finally... I looked into the logs to see why self-healing
wasn't working, and I found this on Xen's side:
**********
2008-12-17 12:08:30 E [tcp-client.c:190:tcp_connect] espai6:
non-blocking connect() returned: 111 (Connection refused)
**********
and it keeps saying this when I want to access files which were
created on the "old nodes".
Is this a bug? How can I work around this?
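Connection refused (errno 111) means nothing accepted the TCP connection on that host, which usually points at a glusterfsd that isn't running or is listening on a different port. A quick check from the Xen box (6996 is the usual default listen port on these releases; adjust if you set option listen-port):
**********
# should connect if glusterfsd is up and listening on 10.0.0.8
telnet 10.0.0.8 6996
**********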
If I create new files, though, they replicate to all 3 nodes, no
problem with that... the only problem is with the old files that
were already present before I added the new node.
Thanks in advance for your help, and let me know if you need any
further information.
--
Raghavendra G
Hi, yes.
When gluster behaves like this, all nodes are running. As I said, when
you create new data, it replicates to all the nodes of each group, so
that part is working fine.
However, it keeps logging "connection refused", which I thought was
reported only when a node wasn't available, but they are all available
and replicating data fine.
The thing, though, is that old data is not being replicated to the
new nodes.
Is there any way to "force" replication to the new nodes? Could I
somehow be getting the "connection refused" because the new nodes
won't accept previous data?
Thanks for your help.