Re: Problems when using different hostnames in a bricks and a peer

Hi Atin,

You are right! I was using version 3.5 in production, and when I checked the Gluster source code, I looked at the wrong commit (not the latest commit on the master branch).

The solution I proposed is already implemented. It was done in the function gd_peerinfo_find_from_addrinfo, in the file xlators/mgmt/glusterd/src/glusterd-peer-utils.c.
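
For the archive, here is a minimal Python sketch of the address comparison idea I proposed. It is only an illustration and not the glusterd C code: the helper names resolve_addresses and find_peer_for_brick are hypothetical, and the lookup table in the example stands in for real name resolution.

```python
import socket


def resolve_addresses(hostname):
    """Return the set of IP address strings that hostname resolves to."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the IP address as a string.
    return {info[4][0] for info in infos}


def find_peer_for_brick(brick_host, peer_hosts, resolve=resolve_addresses):
    """Hypothetical helper: return the peer hostname that owns brick_host, or None."""
    # Step 1: direct hostname match.
    for peer in peer_hosts:
        if peer == brick_host:
            return peer
    # Step 2: compare every resolved brick address with every resolved peer address.
    brick_addrs = resolve(brick_host)
    for peer in peer_hosts:
        if brick_addrs & resolve(peer):
            return peer
    # No match: the caller would treat this as an error.
    return None


# Example with a fake resolver (assumed data mirroring the /etc/hosts entries).
fake_table = {
    "storage1": {"10.10.10.1"},
    "storage1.mydomain.com": {"10.10.10.1"},
    "storage2": {"10.10.10.2"},
    "storage2.mydomain.com": {"10.10.10.2"},
}
peers = ["storage1.mydomain.com", "storage2.mydomain.com"]
print(find_peer_for_brick("storage2", peers, resolve=lambda h: fake_table.get(h, set())))
```

With this approach the order of the names in /etc/hosts does not matter, because only the resolved addresses are compared.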

Thanks for your tip! And sorry for any inconvenience.

--
Rarylson Freitas

On Thu, Jul 2, 2015 at 2:01 AM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
Which Gluster version are you using? The better peer identification feature
(available from 3.6 onwards) should tackle this problem IMO.

~Atin

On 07/02/2015 10:05 AM, Rarylson Freitas wrote:
> Hi,
>
> Recently, my company needed to change the hostnames used in our Gluster
> pool.
>
> At first, we had two Gluster nodes called storage1 and storage2.
> Our volumes used two bricks: storage1:/MYVOLUME and storage2:/MYVOLUME. We
> put the storage1 and storage2 IPs in the /etc/hosts file of our nodes and
> of our client servers.
>
> After some time, more client servers started using Gluster and we
> discovered that using hostnames without a domain (via /etc/hosts) on all
> client servers is a pain in the a$$ :(. So, we decided to change them to
> something like storage1.mydomain.com and storage2.mydomain.com.
>
> Remember that, at this point, we already had some volumes (with bricks):
>
> $ gluster volume info MYVOL
> [...]
> Brick1: storage1:/MYDIR
> Brick2: storage2:/MYDIR
>
> For simplicity, let's consider that we had two Gluster Nodes, each one with
> the following entries in /etc/hosts:
>
> 10.10.10.1  storage1
> 10.10.10.2  storage2
>
> To implement the hostname changes, we changed the /etc/hosts file to:
>
> 10.10.10.1  storage1 storage1.mydomain.com
> 10.10.10.2  storage2 storage2.mydomain.com
>
> And we ran on storage1:
>
> $ gluster peer probe storage2.mydomain.com
> peer probe: success
>
> Everything worked well for some time, but glusterd started to fail
> after any reboot:
>
> $ service glusterfs-server status
> glusterfs-server start/running, process 14714
> $ service glusterfs-server restart
> glusterfs-server stop/waiting
> glusterfs-server start/running, process 14860
> $ service glusterfs-server status
> glusterfs-server stop/waiting
>
> To start the service again, we had to roll back the hostname1
> setting to storage2 in /var/lib/glusterd/peers/OUR_UUID.
>
> After some trial and error, we discovered that if we changed the order of
> the entries in /etc/hosts and repeated the process, everything worked.
>
> That is, from:
>
> 10.10.10.1  storage1 storage1.mydomain.com
> 10.10.10.2  storage2 storage2.mydomain.com
>
> To:
>
> 10.10.10.1  storage1.mydomain.com storage1
> 10.10.10.2  storage2.mydomain.com storage2
>
> And then ran:
>
> gluster peer probe storage2.mydomain.com
> service glusterfs-server restart
>
> So we checked the glusterd debug log and the GlusterFS source code and
> discovered that the key was the function
> glusterd_friend_find_by_hostname, in the file
> xlators/mgmt/glusterd/src/glusterd-utils.c. This function is called for
> each brick that isn't a local brick and does the following things:
>
>    - It checks if the brick hostname is equal to some peer hostname;
>    - If it is, this peer is the friend we want;
>    - If not, it gets the brick IP (resolving the hostname with the function
>    getaddrinfo) and checks if the brick IP is equal to the peer hostname;
>       - This matches if the peer was probed by IP (gluster peer probe
>       10.10.10.2), because the brick IP (storage2 resolves to 10.10.10.2)
>       is then equal to the peer "hostname" (10.10.10.2);
>    - If it is, this peer is the friend we want;
>    - If not, it gets the reverse lookup of the brick IP (using the function
>    getnameinfo) and checks if that name is equal to the peer hostname;
>       - This is why changing the order of the entries in /etc/hosts worked
>       as a workaround for us;
>    - If not, it returns an error (and glusterd will fail).
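
The lookup order described in the list above can be sketched in Python as follows. This is an illustration based only on that description, not a translation of the glusterd C code: find_friend, forward_lookup and reverse_lookup are hypothetical names, and the example replaces real name resolution with small lookup tables that mimic the two /etc/hosts orderings.

```python
import socket


def forward_lookup(hostname):
    """Resolve hostname to one IP string (first getaddrinfo result), or None."""
    try:
        return socket.getaddrinfo(hostname, None)[0][4][0]
    except socket.gaierror:
        return None


def reverse_lookup(ip):
    """Return the name getnameinfo reports for ip, or None on failure."""
    try:
        return socket.getnameinfo((ip, 0), 0)[0]
    except OSError:
        return None


def find_friend(brick_host, peer_hosts, forward=forward_lookup, reverse=reverse_lookup):
    """Hypothetical sketch of the three-step lookup; returns the matching peer or None."""
    # Step 1: the brick hostname is itself a known peer hostname.
    if brick_host in peer_hosts:
        return brick_host
    # Step 2: the brick IP equals a peer "hostname" (peer was probed by IP).
    ip = forward(brick_host)
    if ip is None:
        return None
    if ip in peer_hosts:
        return ip
    # Step 3: the reverse lookup of the brick IP equals a peer hostname.
    name = reverse(ip)
    if name in peer_hosts:
        return name
    return None


# Assumed behaviour: the reverse lookup yields the first name on the hosts line.
forward = lambda host: "10.10.10.2"
short_name_first = {"10.10.10.2": "storage2"}               # storage2 listed first
fqdn_first = {"10.10.10.2": "storage2.mydomain.com"}        # FQDN listed first
peers = ["storage2.mydomain.com"]
print(find_friend("storage2", peers, forward, short_name_first.get))
print(find_friend("storage2", peers, forward, fqdn_first.get))
```

In this sketch the first call finds no peer and the second one does, which mirrors the effect of reordering the names in /etc/hosts.
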
>
> However, we think that comparing the brick IP (resolving the brick
> hostname) with the peer IP (resolving the peer hostname) would be a simpler
> and more comprehensive solution. Even if the brick and the peer have
> different hostnames, it would work as long as they resolve to the same IP.
>
> The solution could be:
>
>    - It checks if the brick hostname is equal to some peer hostname;
>    - If it is, this peer is the friend we want;
>    - If not, it gets both the brick IPs (resolving the brick hostname with
>    the function getaddrinfo) and the peer IPs (resolving the peer hostname)
>    and, for each IP pair, checks if the brick IP is equal to the peer IP;
>    - If it is, this peer is the friend we want;
>    - If not, it returns an error (and glusterd will fail).
>
> What do you think about it?
> --
>
> *Rarylson Freitas*
> Computer Engineer
>
>
>
> _______________________________________________
> Gluster-devel mailing list
> Gluster-devel@xxxxxxxxxxx
> http://www.gluster.org/mailman/listinfo/gluster-devel
>

