Re: Failover problems with gluster 3.8.8-1 (latest Debian stable)

Dave Sherohman <dave@xxxxxxxxxxxxx> · Thu, 15 Feb 2018 06:20:27 -0600

Well, it looks like I've stumped the list, so I did a bit of additional
digging myself:

azathoth replicates with yog-sothoth, so I compared their brick
directories.  `ls -R /var/local/brick0/data | md5sum` gives the same
result on both servers, so the filenames are identical in both bricks.
However, `du -s /var/local/brick0/data` shows that azathoth has about 3G
more data (445G vs 442G) than yog-sothoth.
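
To narrow down which files account for the difference, a per-file listing
from each brick can be diffed.  The following is only a sketch: it assumes
GNU find (for -printf), and the function name brick_manifest and the
/tmp output paths are made up for illustration.  It records both the
apparent size and the allocated blocks, since `du` counts allocated blocks
and sparse image files may differ there without differing in content.

```shell
#!/bin/sh
# Print "path<TAB>bytes<TAB>512-byte-blocks" for every regular file under
# a brick directory, skipping gluster's internal .glusterfs tree.
# Assumes GNU find; brick_manifest is a hypothetical helper name.
brick_manifest() {
    brick=$1
    (
        cd "$brick" || exit 1
        # -prune keeps find out of .glusterfs; %s is the apparent size
        # in bytes, %b the allocated size in 512-byte blocks.
        find . -path ./.glusterfs -prune -o -type f \
            -printf '%p\t%s\t%b\n' | LC_ALL=C sort
    )
}

# Example use (run on each server, then compare the two files):
#   brick_manifest /var/local/brick0/data > /tmp/$(hostname).manifest
#   diff /tmp/azathoth.manifest /tmp/yog-sothoth.manifest
```

Lines that differ only in the block count would point to sparseness rather
than missing data; lines that differ in byte size would be worth a closer
look.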

This seems consistent with my assumption that the problem is on
yog-sothoth: everything is fine when only azathoth is up, and there are
problems when only yog-sothoth is up.  I am also reminded that a few
weeks ago yog-sothoth was offline for 4-5 days, although it should have
been brought back up-to-date once it came back online.

So, assuming that the issue is stale/missing data on yog-sothoth, is
there a way to force gluster to do a full refresh of the data from
azathoth's brick to yog-sothoth's brick?  I would have expected running
heal and/or rebalance to do that sort of thing, but I've run them both
(with and without fix-layout on the rebalance) and the problem persists.
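
For reference, the heal-related commands in question can be sketched as
below.  This is not a confirmed fix for the problem described here, just
the standard CLI calls for requesting a full crawl and listing what is
still pending.  The wrapper function names and the GLUSTER override
variable are my own additions for illustration; the volume name palantir
is the one from the status output quoted below.

```shell
#!/bin/sh
# GLUSTER can be overridden (e.g. GLUSTER=echo) to preview the commands
# without touching the cluster.  Hypothetical convenience, not a gluster
# feature.
GLUSTER=${GLUSTER:-gluster}

# Ask the self-heal daemons to crawl the whole volume, rather than only
# the entries already recorded as needing heal.
full_heal() {
    vol=$1
    $GLUSTER volume heal "$vol" full
}

# List entries still pending heal, then any that are in split-brain.
heal_report() {
    vol=$1
    $GLUSTER volume heal "$vol" info
    $GLUSTER volume heal "$vol" info split-brain
}

# Example use:
#   full_heal palantir
#   heal_report palantir
```

If the report lists entries under the yog-sothoth brick that never clear,
that would at least confirm where the stale data is.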

If there isn't a way to force a refresh, how risky would it be to kill
gluster on yog-sothoth, wipe everything from /var/local/brick0, and then
re-add it to the cluster as if I were replacing a physically failed
disk?  Seems like that should work in principle, but it feels dangerous
to wipe the partition and rebuild, regardless.

On Tue, Feb 13, 2018 at 07:33:44AM -0600, Dave Sherohman wrote:
> I'm using gluster for a virt-store with 3x2 distributed/replicated
> servers for 16 qemu/kvm/libvirt virtual machines using image files
> stored in gluster and accessed via libgfapi.  Eight of these disk images
> are standalone, while the other eight are qcow2 images which all share a
> single backing file.
> 
> For the most part, this is all working very well.  However, one of the
> gluster servers (azathoth) causes three of the standalone VMs and all
> eight of the shared-backing-image VMs to fail if it goes down.  Any of
> the other gluster servers can go down with no problems; only azathoth
> causes issues.
> 
> In addition, the kvm hosts have the gluster volume fuse mounted and one
> of them (out of five) detects an error on the gluster volume and puts
> the fuse mount into read-only mode if azathoth goes down.  libgfapi
> connections to the VM images continue to work normally from this host
> despite this and the other four kvm hosts are unaffected.
> 
> It initially seemed relevant that I have the libgfapi URIs specified as
> gluster://azathoth/..., but I've tried changing them to make the initial
> connection via other gluster hosts and it had no effect on the problem.
> Losing azathoth still took them out.
> 
> In addition to changing the mount URI, I've also manually run a heal and
> rebalance on the volume, enabled the bitrot daemons (then turned them
> back off a week later, since they reported no activity in that time),
> and copied one of the standalone images to a new file in case it was a
> problem with the file itself.  As far as I can tell, none of these
> attempts changed anything.
> 
> So I'm at a loss.  Is this a known type of problem?  If so, how do I fix
> it?  If not, what's the next step to troubleshoot it?
> 
> 
> # gluster --version
> glusterfs 3.8.8 built on Jan 11 2017 14:07:11
> Repository revision: git://git.gluster.com/glusterfs.git
> 
> # gluster volume status
> Status of volume: palantir
> Gluster process                             TCP Port  RDMA Port  Online  Pid
> ------------------------------------------------------------------------------
> Brick saruman:/var/local/brick0/data        49154     0          Y       10690
> Brick gandalf:/var/local/brick0/data        49155     0          Y       18732
> Brick azathoth:/var/local/brick0/data       49155     0          Y       9507
> Brick yog-sothoth:/var/local/brick0/data    49153     0          Y       39559
> Brick cthulhu:/var/local/brick0/data        49152     0          Y       2682
> Brick mordiggian:/var/local/brick0/data     49152     0          Y       39479
> Self-heal Daemon on localhost               N/A       N/A        Y       9614
> Self-heal Daemon on saruman.lub.lu.se       N/A       N/A        Y       15016
> Self-heal Daemon on cthulhu.lub.lu.se       N/A       N/A        Y       9756
> Self-heal Daemon on gandalf.lub.lu.se       N/A       N/A        Y       5962
> Self-heal Daemon on mordiggian.lub.lu.se    N/A       N/A        Y       8295
> Self-heal Daemon on yog-sothoth.lub.lu.se   N/A       N/A        Y       7588
>  
> Task Status of Volume palantir
> ------------------------------------------------------------------------------
> Task                 : Rebalance           
> ID                   : c38e11fe-fe1b-464d-b9f5-1398441cc229
> Status               : completed           
>  
> 
> -- 
> Dave Sherohman
> _______________________________________________
> Gluster-users mailing list
> Gluster-users@xxxxxxxxxxx
> http://lists.gluster.org/mailman/listinfo/gluster-users

-- 
Dave Sherohman