I have mounted the halo glusterfs volume in debug mode, and the output is as follows:
.
.
.
[2018-02-05 11:42:48.282473] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-test-halo-client-1: Ping latency is 0ms
[2018-02-05 11:42:48.282502] D [MSGID: 0] [afr-common.c:5025:afr_get_halo_latency] 0-test-halo-replicate-0: Using halo latency 10
[2018-02-05 11:42:48.282525] D [MSGID: 0] [afr-common.c:4820:__afr_handle_ping_event] 0-test-halo-client-1: Client ping @ 140032933708544 ms
.
.
.
[2018-02-05 11:42:48.393776] D [MSGID: 0] [afr-common.c:4803:find_worst_up_child] 0-test-halo-replicate-0: Found worst up child (1) @ 140032933708544 ms latency
[2018-02-05 11:42:48.393803] D [MSGID: 0] [afr-common.c:4903:__afr_handle_child_up_event] 0-test-halo-replicate-0: Marking child 1 down, doesn't meet halo threshold (10), and > halo_min_replicas (2)
.
.
.
I think this debug output means the following:
The ping time for test-halo-client-1 (brick2) is about 0.5 ms, yet halo decides that it is not under the halo threshold (10 ms), so halo makes the wrong decision when selecting bricks. Setting the threshold to 0 is rejected:
# gluster vol set test-halo cluster.halo-max-latency 0
volume set: failed: '0' in 'option halo-max-latency 0' is out of range [1 - 99999]
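
To make this reading of the log concrete, here is a minimal Python sketch that pulls the three numbers out of the debug lines quoted in this message and compares them. The regular expressions and the function name `extract_latencies` are only illustrative assumptions based on the message text shown here; this is not part of any gluster tooling.

```python
import re

# Three of the debug lines quoted in this message, copied verbatim.
LOG = """\
[2018-02-05 11:42:48.282473] D [rpc-clnt-ping.c:211:rpc_clnt_ping_cbk] 0-test-halo-client-1: Ping latency is 0ms
[2018-02-05 11:42:48.282502] D [MSGID: 0] [afr-common.c:5025:afr_get_halo_latency] 0-test-halo-replicate-0: Using halo latency 10
[2018-02-05 11:42:48.282525] D [MSGID: 0] [afr-common.c:4820:__afr_handle_ping_event] 0-test-halo-client-1: Client ping @ 140032933708544 ms
"""


def extract_latencies(log_text):
    """Hypothetical helper: read the three latency values from the log text."""
    rpc = re.search(r"Ping latency is (\d+)ms", log_text)
    threshold = re.search(r"Using halo latency (\d+)", log_text)
    afr = re.search(r"Client ping @ (\d+) ms", log_text)
    return {
        "rpc_ping_ms": int(rpc.group(1)),
        "halo_threshold_ms": int(threshold.group(1)),
        "afr_ping_ms": int(afr.group(1)),
    }


values = extract_latencies(LOG)
# The rpc layer reports 0 ms, but the value logged by the ping event handler
# is far above the 10 ms threshold.
exceeds_threshold = values["afr_ping_ms"] > values["halo_threshold_ms"]
print(values)
print(exceeds_threshold)
```

The point of the comparison is that the value logged as "Client ping" does not match the latency reported by the rpc ping callback for the same client. That would explain why child 1 is reported as not meeting the halo threshold.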
On Sun, Feb 4, 2018 at 2:27 PM, atris adam <atris.adam@xxxxxxxxx> wrote:
I have 2 data centers in two different regions; each DC has 3 servers. I have created a glusterfs volume with 4 replicas. This is the glusterfs volume info output:

Volume Name: test-halo
Type: Replicate
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x 4 = 4
Transport-type: tcp
Bricks:
Brick1: 10.0.0.1:/mnt/test1
Brick2: 10.0.0.3:/mnt/test2
Brick3: 10.0.0.5:/mnt/test3
Brick4: 10.0.0.6:/mnt/test4
Options Reconfigured:
cluster.halo-shd-max-latency: 5
cluster.halo-max-latency: 10
cluster.quorum-count: 2
cluster.quorum-type: fixed
cluster.halo-enabled: yes
transport.address-family: inet
nfs.disable: on

The bricks with IPs 10.0.0.1 & 10.0.0.3 are in region A, and the bricks with IPs 10.0.0.5 & 10.0.0.6 are in region B.

When I mount the volume in region A, I expect the data to be stored first on brick1 & brick2 and then copied asynchronously to region B, on brick3 & brick4. Am I right? Is this what halo claims?

If yes, unfortunately this does not happen for me. No matter whether I mount the volume in region A or in region B, all the data is copied to brick3 & brick4 and no data is copied to brick1 & brick2.

Pinging the brick IPs from region A gives the following:
ping 10.0.0.1 & 10.0.0.3: below time=0.500 ms
ping 10.0.0.5 & 10.0.0.6: more than time=20 ms

What logic does halo use to select the bricks to write to? If it is the access time, then when I mount the volume in region A the ping time to brick1 & brick2 is below 0.5 ms, but halo selects brick3 & brick4!

The glusterfs version is:
glusterfs 3.12.4

I really need to work with the halo feature, but I have not been able to get this case working. Can anyone help me soon?

Thanks a lot
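
The selection expected in the message above can be written down as a small Python model. It is a sketch of the expectation only, not the actual AFR code: it assumes halo keeps every brick whose latency is at or below cluster.halo-max-latency and falls back to the nearest bricks when fewer than the minimum number of replicas qualify. The function name `bricks_within_halo` is hypothetical, the minimum of 2 replicas is taken from the halo_min_replicas value in the debug log, and the ping values 0.5 ms and 20 ms are rounded from the "below 0.5 ms" and "more than 20 ms" figures reported for region A.

```python
def bricks_within_halo(ping_ms, max_latency_ms, min_replicas):
    """Model of the expected choice: bricks within the threshold, nearest first."""
    # Order bricks from lowest to highest measured latency.
    ordered = sorted(ping_ms, key=ping_ms.get)
    near = [brick for brick in ordered if ping_ms[brick] <= max_latency_ms]
    # Assumed fallback: keep at least min_replicas bricks, taking the nearest.
    if len(near) < min_replicas:
        near = ordered[:min_replicas]
    return near


# Approximate ping times seen from a client in region A.
region_a_ping_ms = {
    "brick1": 0.5,   # 10.0.0.1
    "brick2": 0.5,   # 10.0.0.3
    "brick3": 20.0,  # 10.0.0.5
    "brick4": 20.0,  # 10.0.0.6
}

selected = bricks_within_halo(region_a_ping_ms, max_latency_ms=10, min_replicas=2)
print(selected)  # ['brick1', 'brick2']
```

Under these assumptions a client in region A would write to brick1 & brick2 first, which is the opposite of the behaviour observed.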