Hi,
On 16 October 2018 at 18:20, <jring@xxxxxxx> wrote:
Hi everybody,
I created a distributed dispersed volume on 4.1.5 under CentOS 7 a few days ago, like this:
gluster volume create data_vol1 disperse-data 4 redundancy 2 transport tcp \
\
gf-p-d-01.isec.foobar.com:/bricks/brick1/brick \
gf-p-d-03.isec.foobar.com:/bricks/brick1/brick \
gf-p-d-04.isec.foobar.com:/bricks/brick1/brick \
gf-p-k-01.isec.foobar.com:/bricks/brick1/brick \
gf-p-k-03.isec.foobar.com:/bricks/brick1/brick \
gf-p-k-04.isec.foobar.com:/bricks/brick1/brick \
\
gf-p-d-01.isec.foobar.com:/bricks/brick2/brick \
gf-p-d-03.isec.foobar.com:/bricks/brick2/brick \
gf-p-d-04.isec.foobar.com:/bricks/brick2/brick \
gf-p-k-01.isec.foobar.com:/bricks/brick2/brick \
gf-p-k-03.isec.foobar.com:/bricks/brick2/brick \
gf-p-k-04.isec.foobar.com:/bricks/brick2/brick \
\
... same for brick3 to brick9...
\
gf-p-d-01.isec.foobar.com:/bricks/brick10/brick \
gf-p-d-03.isec.foobar.com:/bricks/brick10/brick \
gf-p-d-04.isec.foobar.com:/bricks/brick10/brick \
gf-p-k-01.isec.foobar.com:/bricks/brick10/brick \
gf-p-k-03.isec.foobar.com:/bricks/brick10/brick \
gf-p-k-04.isec.foobar.com:/bricks/brick10/brick
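As a sanity check (and assuming the bricks were passed in the intended order), the resulting layout can be verified with:

gluster volume info data_vol1

which should report the volume as Distributed-Disperse with 10 x (4 + 2) = 60 bricks, i.e. ten disperse sets of four data plus two redundancy bricks each.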
This worked nicely and resulted in the following filesystem:
[root@gf-p-d-01 ~]# df -h /data/
Filesystem Size Used Avail Use% Mounted on
gf-p-d-01.isec.foobar.com:/data_vol1 219T 2.2T 217T 2% /data
Each of the bricks resides on its own 6 TB disk with one big partition formatted with XFS.
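(That size also matches a quick back-of-the-envelope check: 60 bricks in groups of 4+2 give 10 disperse subvolumes, each contributing 4 x 6 TB of usable capacity, so roughly 10 * 4 * 6 TB = 240 TB, or about 218 TiB -- in line with the 219T shown above.)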
Yesterday a colleague looked at the filesystem and found some space missing...
[root@gf-p-d-01 ~]# df -h /data/
Filesystem Size Used Avail Use% Mounted on
gf-p-d-01.isec.foobar.com:/data_vol1 22T 272G 22T 2% /data
Some googling turned up the following bug report against 3.4, which looks familiar:
https://bugzilla.redhat.com/show_bug.cgi?id=1541830
So we ran a quick grep shared-brick-count /var/lib/glusterd/vols/data_vol1/* on all boxes and found that on 5 out of 6 boxes shared-brick-count was 0 for all bricks on remote boxes and 1 for the local bricks.
Is this the expected result, or should we have 1 everywhere (as the quick fix script from the case sets it)?
No, this is fine. The shared-brick-count only needs to be 1 for the local bricks. The value for the remote bricks can be 0.
Also, on one box (the one we created the volume from, btw) we have shared-brick-count=0 for all remote bricks and 10 for the local bricks.
This is a problem. The shared-brick-count should be 1 for the local bricks here as well.
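If I'm reading that bug correctly, the posix translator divides a brick's reported size by its shared-brick-count (the option is meant to account for several bricks sharing one filesystem), so a value of 10 makes every brick on that node report only a tenth of its real capacity. And since each disperse set contains a brick from that node, the whole volume shrinks to roughly a tenth of its size: 219T / 10 is about 22T, which is what your second df shows.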
Is it possible that the bug from 3.4 still exists in 4.1.5, and should we try the filter script that sets shared-brick-count=1 for all bricks?
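(If I understand the workaround from that case correctly, it boils down to forcing shared-brick-count back to 1 in the generated volfiles, roughly like this -- a sketch only, using the same path as the grep above; we would of course back up the volfiles first and restart glusterd afterwards:

# rewrite any "option shared-brick-count N" line to use 1 instead
grep -rl 'option shared-brick-count' /var/lib/glusterd/vols/data_vol1/ \
    | xargs sed -i 's/option shared-brick-count [0-9]*/option shared-brick-count 1/'
)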
Can you try
1. restarting glusterd on all the nodes one after another (not at the same time)
2. setting a volume option (say gluster volume set <volname> cluster.min-free-disk 11%)
and see if it fixes the issue?
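Concretely, something like this (assuming glusterd is managed by systemd on your CentOS 7 nodes):

# on each node, one at a time
systemctl restart glusterd

# then, from any one node, set an option to trigger a volfile regeneration
gluster volume set data_vol1 cluster.min-free-disk 11%

# and re-check
grep shared-brick-count /var/lib/glusterd/vols/data_vol1/*
df -h /data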
Regards,
Nithya
The volume is not currently in production so now would be the time to play around and find the problem...
TIA and regards,
Joachim
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users