Re: Healing queue rarely empty


 



On 17/12/2015 10:10, Nicolas Ecarnot wrote:
Hello,

Our setup: 3 CentOS 7.2 nodes running gluster 3.7.6 in replica-3, used as
storage+compute for an oVirt 3.5.6 DC.

Two days ago, we added some nagios/centreon monitoring that checks the
state of the heal queue every 5 minutes
(something like "gluster volume heal some_vol info" with the appropriate
grep).
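
For reference, the kind of check described above can be sketched as follows. This is only a minimal sketch, not the actual probe we run: the function names are made up for illustration, and the SAMPLE text (brick paths and file names included) is a hypothetical stand-in for what "gluster volume heal some_vol info" prints, assuming each brick section starts with "Brick ..." and ends with a "Number of entries: N" line.

```python
import re
import subprocess


def heal_entries_per_brick(heal_info_text):
    """Parse heal-info text and return {brick: number of pending entries}."""
    counts = {}
    brick = None
    for raw in heal_info_text.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            # Start of a new brick section, e.g. "Brick node1:/bricks/some_vol"
            brick = line[len("Brick "):]
            continue
        match = re.match(r"Number of entries:\s*(\d+)$", line)
        if match and brick is not None:
            counts[brick] = int(match.group(1))
    return counts


def run_heal_info(volume):
    """Run the CLI command quoted above and return its standard output."""
    result = subprocess.run(
        ["gluster", "volume", "heal", volume, "info"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical output, for illustration only (not taken from our cluster).
SAMPLE = """\
Brick node1:/bricks/some_vol
/images/vm1.img
/images/vm2.img
Number of entries: 2

Brick node2:/bricks/some_vol
Number of entries: 0

Brick node3:/bricks/some_vol
Number of entries: 0
"""

if __name__ == "__main__":
    for brick_name, pending in heal_entries_per_brick(SAMPLE).items():
        print(brick_name, pending)
```

On a real node, heal_entries_per_brick(run_heal_info("some_vol")) would give the per-brick values that end up in the graph.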

I expected the "Number of entries" of every node to appear in the graph
as a flat zero line most of the time, except in the rare case of a
node reboot, after which healing is launched and takes some minutes
(sometimes hours) but completes fine.

Instead, we see the healing queue holding 2 or 3 files to heal about
4 times an hour, all day long.

Our DC is a small one with few VMs, so no more than 8 big
files are stored in glusterfs.
I'm very surprised to see that these files constantly need healing, as I
thought I had understood that reads/writes were synchronous at all times,
and that replica-3 meant that every file was fully synced and committed
at all times.

I've also read about the self-healing daemon's cron-like job that runs
every 10 minutes, which we are using with its defaults, but that is a
second point.

The first point leads to:
- Why do we see such frequent desynchronizations between nodes?
- Which logs can I read to confirm that?
- What must I check?


Replying to myself, since I found:
https://www.mail-archive.com/gluster-users%40gluster.org/msg20611.html

does it make sense to be surprised to see:

gluster volume get data cluster.op-version
Option                                  Value
------                                  -----
cluster.op-version                      30600

in a 3.7.6 gluster cluster?
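
To make the mismatch concrete, here is a small sketch that compares the reported value with the installed release. It rests on an assumption not confirmed anywhere in this thread: that op-version numbers are built as major*10000 + minor*100 + patch. The helper names are hypothetical.

```python
def decode_op_version(op_version):
    """Split an op-version integer into (major, minor, patch).

    Assumes the major*10000 + minor*100 + patch convention.
    """
    major, rest = divmod(op_version, 10000)
    minor, patch = divmod(rest, 100)
    return (major, minor, patch)


def encode_op_version(major, minor, patch):
    """Inverse of decode_op_version, under the same assumption."""
    return major * 10000 + minor * 100 + patch


if __name__ == "__main__":
    reported = 30600          # value shown by "gluster volume get"
    installed = (3, 7, 6)     # gluster release running on the nodes
    print("reported op-version decodes to", decode_op_version(reported))
    print("installed release would encode to", encode_op_version(*installed))
```

Under that assumption, 30600 would correspond to a 3.6.0 level rather than to the 3.7.6 binaries that are installed.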

I have absolutely no idea what this means or how it changes anything. But I see many messages in my logs like:

Server and Client lk-version numbers are not same, reopening the fds

and

many, many errors in etc-glusterfs-glusterd.vol.log about
missing options, along with others such as 'Unable to release lock' and very frequent vol reqs:
http://pastebin.com/e6nQfeLx
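
To put numbers on "many", a quick sketch like the one below can count how often each of these messages occurs. The function names are hypothetical, the search strings are the ones quoted above, and the log directory in the usage note is an assumption (the usual default), so adjust it to your install.

```python
# Message fragments quoted earlier in this mail.
PATTERNS = (
    "Server and Client lk-version numbers are not same",
    "Unable to release lock",
)


def count_messages(lines, patterns=PATTERNS):
    """Return {pattern: number of lines containing that pattern}."""
    counts = {pattern: 0 for pattern in patterns}
    for line in lines:
        for pattern in patterns:
            if pattern in line:
                counts[pattern] += 1
    return counts


def count_messages_in_file(path, patterns=PATTERNS):
    """Same as count_messages, reading the lines from a log file."""
    with open(path, encoding="utf-8", errors="replace") as handle:
        return count_messages(handle, patterns)
```

For example, count_messages_in_file("/var/log/glusterfs/etc-glusterfs-glusterd.vol.log") would report the totals for the glusterd log, assuming it lives in that directory.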

What is op-version used for?

--
Nicolas ECARNOT
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users



