On one of the replica servers, the client mount did not have an open connection to the glusterfsd (brick) process on the other server. To illustrate:
root@serv1:/root> ps -ef | grep replicated_vol
root 30627 1 0 Jan29 ? 00:17:30 /usr/sbin/glusterfs --volfile-id=replicated_vol --volfile-server=serv1 /mnt/replicated_vol
root 31132 18322 0 23:04 pts/1 00:00:00 grep replicated_vol
root 31280 1 0 06:32 ? 00:09:10 /usr/sbin/glusterfsd -s serv1 --volfile-id replicated_vol.serv1.mnt-bricks-replicated_vol-brick -p /var/lib/glusterd/vols/replicated_vol/run/serv1-mnt-bricks-replicated_vol-brick.pid -S /var/run/4d70e99b47c1f95cc2eab1715d3a9b67.socket --brick-name /mnt/bricks/replicated_vol/brick -l /var/log/glusterfs/bricks/mnt-bricks-replicated_vol-bricks.log --xlator-option *-posix.glusterd-uuid=c7930be6-969f-4f62-b119-c5bbe4df22a3 --brick-port 49172 --xlator-option replicated_vol.listen-port=49172
root@serv1:/root> netstat -p | grep 30627
tcp 0 0 serv1:715 serv1:24007 ESTABLISHED 30627/glusterfs <= client<->local glusterd
tcp 0 0 serv1:863 serv1:49172 ESTABLISHED 30627/glusterfs <= client<->local brick
root@serv1:/root>
However, the client on the other server did have a connection open to the remote brick, so whatever was written on that server was replicated to both bricks immediately.
root@serv0:/root> ps -ef | grep replicated_vol
root 12761 7556 0 23:05 pts/1 00:00:00 grep replicated_vol
root 15067 1 0 06:32 ? 00:04:50 /usr/sbin/glusterfsd -s serv1 --volfile-id replicated_vol.serv1.mnt-bricks-replicated_vol-brick -p /var/lib/glusterd/vols/replicated_vol/run/serv1-mnt-bricks-replicated_vol-brick.pid -S /var/run/f642d7dbff0ab7a475a23236f6f50b33.socket --brick-name /mnt/bricks/replicated_vol/brick -l /var/log/glusterfs/bricks/mnt-bricks-replicated_vol-bricks.log --xlator-option *-posix.glusterd-uuid=13df1bd2-6dc8-49fa-ade0-5cd95f6b1f19 --brick-port 49209 --xlator-option replicated_vol.listen-port=49209
root 30587 1 0 Jan30 ? 00:12:17 /usr/sbin/glusterfs --volfile-id=serv --volfile-server=serv0 /mnt/replicated_vol
root@serv0:/root> netstat -p | grep 30587
tcp 0 0 serv0:859 serv1:49172 ESTABLISHED 30587/glusterfs <= client<->remote brick
tcp 0 0 serv0:746 serv0:24007 ESTABLISHED 30587/glusterfs <= client<->glusterd
tcp 0 0 serv0:857 serv0:49209 ESTABLISHED 30587/glusterfs <= client<->local brick
root@serv0:/root>
So, the client on serv1 has no open TCP link with the mate brick, which is why it cannot write to the mate brick directly and has to rely on the self-heal daemon to do the job. Of course, I now need to debug why the connection fails, but at least we are clear on AFR.
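For anyone wanting to repeat this check, here is a rough Python sketch of the comparison I did by eye above: it takes the text printed by netstat -p, collects the remote endpoints held ESTABLISHED by the client PID, and reports which expected bricks have no link. The function names and the (host, port) tuple layout are my own illustration, not anything shipped with glusterfs, and the parsing assumes the netstat column layout shown above.

```python
def brick_links(netstat_output, pid):
    """Return the set of (host, port) remote endpoints that `pid` holds ESTABLISHED.

    Assumes the column layout seen in the post:
    proto recv-q send-q local remote state pid/program [anything else]
    """
    endpoints = set()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines and sockets in any other state.
        if len(fields) < 7 or fields[5] != "ESTABLISHED":
            continue
        # The seventh column looks like "30627/glusterfs".
        if fields[6].split("/")[0] != str(pid):
            continue
        host, _, port = fields[4].rpartition(":")
        endpoints.add((host, port))
    return endpoints


def missing_bricks(netstat_output, pid, expected_bricks):
    """Return the expected (host, port) bricks the client is NOT connected to."""
    return sorted(set(expected_bricks) - brick_links(netstat_output, pid))


if __name__ == "__main__":
    # Hypothetical usage with the serv1 output pasted above.
    sample = (
        "tcp 0 0 serv1:715 serv1:24007 ESTABLISHED 30627/glusterfs\n"
        "tcp 0 0 serv1:863 serv1:49172 ESTABLISHED 30627/glusterfs\n"
    )
    bricks = [("serv0", "49209"), ("serv1", "49172")]
    print(missing_bricks(sample, 30627, bricks))
```

The brick ports are the --brick-port values from the ps output on each server. Note that netstat without -n may print service names in place of port numbers, so the ports are compared as strings.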
Thanks everyone.
From: A Ghoshal <a.ghoshal@xxxxxxx>
To: gluster-users@xxxxxxxxxxx
Date: 02/03/2015 12:00 AM
Subject: A few queries on self-healing and AFR (glusterfs 3.4.2)
Sent by: gluster-users-bounces@xxxxxxxxxxx
Hello,
I have a replica-2 volume in which I store a large number of files that are updated frequently (critical log files, etc). My files are generally stable, but one thing that does worry me from time to time is that files show up on one of the bricks in the output of gluster v <volname> heal info. These entries disappear on their own after a while (I am guessing when cluster.heal-timeout expires and another heal by the self-heal daemon is triggered). For certain files, this could be a bit of a bother - in terms of fault tolerance...
I was wondering if there is a way I could force AFR to return write-completion to the application only _after_ the data is written to both replicas successfully (kind of, like, atomic writes) - even if it were at the cost of performance. This way I could ensure that my bricks shall always be in sync.
The other thing I could possibly do is reduce my cluster.heal-timeout (it is 600 currently). Is it a bad idea to set it to something as small as say, 60 seconds for volumes where redundancy is a prime concern?
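To make the option change concrete, here is a minimal Python sketch that builds (and optionally runs) the gluster volume set command for cluster.heal-timeout. The helper names and the dry_run switch are hypothetical, and the volume name used in the example is just the one from this thread; whether 60 seconds is a sensible value is exactly the open question above.

```python
import subprocess


def heal_timeout_command(volname, seconds):
    """Build the argv for: gluster volume set <volname> cluster.heal-timeout <seconds>."""
    seconds = int(seconds)
    if seconds <= 0:
        raise ValueError("heal timeout must be a positive number of seconds")
    return ["gluster", "volume", "set", volname, "cluster.heal-timeout", str(seconds)]


def set_heal_timeout(volname, seconds, dry_run=True):
    """Return the command line; execute it only when dry_run is False."""
    cmd = heal_timeout_command(volname, seconds)
    if not dry_run:
        # Needs to run as root on a node that is part of the trusted pool.
        subprocess.run(cmd, check=True)
    return " ".join(cmd)


if __name__ == "__main__":
    # Dry run: only prints the command that would be issued.
    print(set_heal_timeout("replicated_vol", 60))
```

Keeping dry_run on by default means the command can be reviewed before anything on the volume changes.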
One question, though - is healing through the self-heal daemon accomplished using a separate thread for each replicated volume, or is there a single thread shared by all volumes? The reason I ask is I have a large number of replicated file-systems on each volume (17, to be precise) but I do have a reasonably powerful multicore processor array and large RAM, and top indicates the load on the system resources is quite moderate.
Thanks,
Anirban
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users