I tried to resolve the split-brain with the gluster CLI commands
(on the directory reported by the previous commands, and on the
file), but it did not help:
root@dist-gl2:/# gluster v heal repofiles split-brain bigger-file /
Healing / failed:Operation not permitted.
Volume heal failed.
root@dist-gl2:/# gluster v heal repofiles split-brain bigger-file /test
Lookup failed on /test:Input/output error
Volume heal failed.
root@dist-gl2:/# gluster v heal repofiles split-brain source-brick dist-gl1:/brick1 /
Healing / failed:Operation not permitted.
Volume heal failed.
root@dist-gl2:/# gluster v heal repofiles split-brain source-brick dist-gl1:/brick1 /test
Lookup failed on /test:Input/output error
Volume heal failed.
root@dist-gl2:/# gluster v heal repofiles split-brain source-brick dist-gl2:/brick1 /
Healing / failed:Operation not permitted.
Volume heal failed.
root@dist-gl2:/# gluster v heal repofiles split-brain source-brick dist-gl2:/brick1 /test
Lookup failed on /test:Input/output error
Volume heal failed.
root@dist-gl2:/#
Relevant parts of glfsheal-repofiles.log follow.
When trying to resolve the split-brain on the directory ("/"):
[2015-07-15 19:45:30.508670] I
[event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll:
Started thread with index 1
[2015-07-15 19:45:30.516662] I
[event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll:
Started thread with index 2
[2015-07-15 19:45:30.517201] I [MSGID: 104045]
[glfs-master.c:95:notify] 0-gfapi: New graph
64697374-2d67-6c32-2d32-303634362d32 (0) coming up
[2015-07-15 19:45:30.517227] I [MSGID: 114020]
[client.c:2118:notify] 0-repofiles-client-0: parent
translators are ready, attempting connect on transport
[2015-07-15 19:45:30.525457] I [MSGID: 114020]
[client.c:2118:notify] 0-repofiles-client-1: parent
translators are ready, attempting connect on transport
[2015-07-15 19:45:30.526788] I
[rpc-clnt.c:1819:rpc_clnt_reconfig] 0-repofiles-client-0:
changing port to 49152 (from 0)
[2015-07-15 19:45:30.534012] I
[rpc-clnt.c:1819:rpc_clnt_reconfig] 0-repofiles-client-1:
changing port to 49152 (from 0)
[2015-07-15 19:45:30.536252] I [MSGID: 114057]
[client-handshake.c:1438:select_server_supported_programs]
0-repofiles-client-0: Using Program GlusterFS 3.3, Num
(1298437), Version (330)
[2015-07-15 19:45:30.536606] I [MSGID: 114046]
[client-handshake.c:1214:client_setvolume_cbk]
0-repofiles-client-0: Connected to repofiles-client-0,
attached to remote volume '/brick1'.
[2015-07-15 19:45:30.536621] I [MSGID: 114047]
[client-handshake.c:1225:client_setvolume_cbk]
0-repofiles-client-0: Server and Client lk-version numbers
are not same, reopening the fds
[2015-07-15 19:45:30.536679] I [MSGID: 108005]
[afr-common.c:3883:afr_notify] 0-repofiles-replicate-0:
Subvolume 'repofiles-client-0' came back up; going online.
[2015-07-15 19:45:30.536819] I [MSGID: 114035]
[client-handshake.c:193:client_set_lk_version_cbk]
0-repofiles-client-0: Server lk version = 1
[2015-07-15 19:45:30.543712] I [MSGID: 114057]
[client-handshake.c:1438:select_server_supported_programs]
0-repofiles-client-1: Using Program GlusterFS 3.3, Num
(1298437), Version (330)
[2015-07-15 19:45:30.543919] I [MSGID: 114046]
[client-handshake.c:1214:client_setvolume_cbk]
0-repofiles-client-1: Connected to repofiles-client-1,
attached to remote volume '/brick1'.
[2015-07-15 19:45:30.543933] I [MSGID: 114047]
[client-handshake.c:1225:client_setvolume_cbk]
0-repofiles-client-1: Server and Client lk-version numbers
are not same, reopening the fds
[2015-07-15 19:45:30.554650] I [MSGID: 114035]
[client-handshake.c:193:client_set_lk_version_cbk]
0-repofiles-client-1: Server lk version = 1
[2015-07-15 19:45:30.557628] I
[afr-self-heal-entry.c:565:afr_selfheal_entry_do]
0-repofiles-replicate-0: performing entry selfheal on
00000000-0000-0000-0000-000000000001
[2015-07-15 19:45:30.560002] E
[afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch]
0-repofiles-replicate-0: Gfid mismatch detected for
<00000000-0000-0000-0000-000000000001/test>,
e42d3f03-0633-4954-95ce-5cd8710e595e on repofiles-client-1
and 16da3178-8a6e-4010-b874-7f11449d1993 on
repofiles-client-0. Skipping conservative merge on the file.
[2015-07-15 19:45:30.561582] W
[afr-common.c:1985:afr_discover_done]
0-repofiles-replicate-0: no read subvols for /
[2015-07-15 19:45:30.561604] I
[afr-common.c:1673:afr_local_discovery_cbk]
0-repofiles-replicate-0: selecting local read_child
repofiles-client-1
[2015-07-15 19:45:30.561900] W
[afr-common.c:1985:afr_discover_done]
0-repofiles-replicate-0: no read subvols for /
[2015-07-15 19:45:30.561962] I [MSGID: 104041]
[glfs-resolve.c:843:__glfs_active_subvol] 0-repofiles:
switched to graph 64697374-2d67-6c32-2d32-303634362d32 (0)
[2015-07-15 19:45:30.562259] W
[afr-common.c:1985:afr_discover_done]
0-repofiles-replicate-0: no read subvols for /
[2015-07-15 19:45:32.563285] W
[afr-common.c:1985:afr_discover_done]
0-repofiles-replicate-0: no read subvols for /
[2015-07-15 19:45:32.564898] I
[afr-self-heal-entry.c:565:afr_selfheal_entry_do]
0-repofiles-replicate-0: performing entry selfheal on
00000000-0000-0000-0000-000000000001
[2015-07-15 19:45:32.566693] E
[afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch]
0-repofiles-replicate-0: Gfid mismatch detected for
<00000000-0000-0000-0000-000000000001/test>,
e42d3f03-0633-4954-95ce-5cd8710e595e on repofiles-client-1
and 16da3178-8a6e-4010-b874-7f11449d1993 on
repofiles-client-0. Skipping conservative merge on the file.
When trying to resolve the split-brain on the file ("/test"):
[2015-07-15 19:48:45.910819] I
[event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll:
Started thread with index 1
[2015-07-15 19:48:45.919854] I
[event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll:
Started thread with index 2
[2015-07-15 19:48:45.920434] I [MSGID: 104045]
[glfs-master.c:95:notify] 0-gfapi: New graph
64697374-2d67-6c32-2d32-313133392d32 (0) coming up
[2015-07-15 19:48:45.920481] I [MSGID: 114020]
[client.c:2118:notify] 0-repofiles-client-0: parent
translators are ready, attempting connect on transport
[2015-07-15 19:48:45.996442] I [MSGID: 114020]
[client.c:2118:notify] 0-repofiles-client-1: parent
translators are ready, attempting connect on transport
[2015-07-15 19:48:45.997892] I
[rpc-clnt.c:1819:rpc_clnt_reconfig] 0-repofiles-client-0:
changing port to 49152 (from 0)
[2015-07-15 19:48:46.005153] I
[rpc-clnt.c:1819:rpc_clnt_reconfig] 0-repofiles-client-1:
changing port to 49152 (from 0)
[2015-07-15 19:48:46.007437] I [MSGID: 114057]
[client-handshake.c:1438:select_server_supported_programs]
0-repofiles-client-0: Using Program GlusterFS 3.3, Num
(1298437), Version (330)
[2015-07-15 19:48:46.007928] I [MSGID: 114046]
[client-handshake.c:1214:client_setvolume_cbk]
0-repofiles-client-0: Connected to repofiles-client-0,
attached to remote volume '/brick1'.
[2015-07-15 19:48:46.007945] I [MSGID: 114047]
[client-handshake.c:1225:client_setvolume_cbk]
0-repofiles-client-0: Server and Client lk-version numbers
are not same, reopening the fds
[2015-07-15 19:48:46.008020] I [MSGID: 108005]
[afr-common.c:3883:afr_notify] 0-repofiles-replicate-0:
Subvolume 'repofiles-client-0' came back up; going online.
[2015-07-15 19:48:46.008189] I [MSGID: 114035]
[client-handshake.c:193:client_set_lk_version_cbk]
0-repofiles-client-0: Server lk version = 1
[2015-07-15 19:48:46.014313] I [MSGID: 114057]
[client-handshake.c:1438:select_server_supported_programs]
0-repofiles-client-1: Using Program GlusterFS 3.3, Num
(1298437), Version (330)
[2015-07-15 19:48:46.014536] I [MSGID: 114046]
[client-handshake.c:1214:client_setvolume_cbk]
0-repofiles-client-1: Connected to repofiles-client-1,
attached to remote volume '/brick1'.
[2015-07-15 19:48:46.014550] I [MSGID: 114047]
[client-handshake.c:1225:client_setvolume_cbk]
0-repofiles-client-1: Server and Client lk-version numbers
are not same, reopening the fds
[2015-07-15 19:48:46.026828] I [MSGID: 114035]
[client-handshake.c:193:client_set_lk_version_cbk]
0-repofiles-client-1: Server lk version = 1
[2015-07-15 19:48:46.029357] I
[afr-self-heal-entry.c:565:afr_selfheal_entry_do]
0-repofiles-replicate-0: performing entry selfheal on
00000000-0000-0000-0000-000000000001
[2015-07-15 19:48:46.031719] E
[afr-self-heal-entry.c:249:afr_selfheal_detect_gfid_and_type_mismatch]
0-repofiles-replicate-0: Gfid mismatch detected for
<00000000-0000-0000-0000-000000000001/test>,
e42d3f03-0633-4954-95ce-5cd8710e595e on repofiles-client-1
and 16da3178-8a6e-4010-b874-7f11449d1993 on
repofiles-client-0. Skipping conservative merge on the file.
[2015-07-15 19:48:46.033222] W
[afr-common.c:1985:afr_discover_done]
0-repofiles-replicate-0: no read subvols for /
[2015-07-15 19:48:46.033224] I
[afr-common.c:1673:afr_local_discovery_cbk]
0-repofiles-replicate-0: selecting local read_child
repofiles-client-1
[2015-07-15 19:48:46.033569] W
[afr-common.c:1985:afr_discover_done]
0-repofiles-replicate-0: no read subvols for /
[2015-07-15 19:48:46.033624] I [MSGID: 104041]
[glfs-resolve.c:843:__glfs_active_subvol] 0-repofiles:
switched to graph 64697374-2d67-6c32-2d32-313133392d32 (0)
[2015-07-15 19:48:46.033906] W
[afr-common.c:1985:afr_discover_done]
0-repofiles-replicate-0: no read subvols for /
[2015-07-15 19:48:48.036482] W [MSGID: 108008]
[afr-self-heal-name.c:353:afr_selfheal_name_gfid_mismatch_check]
0-repofiles-replicate-0: GFID mismatch for
<gfid:00000000-0000-0000-0000-000000000001>/test
e42d3f03-0633-4954-95ce-5cd8710e595e on repofiles-client-1
and 16da3178-8a6e-4010-b874-7f11449d1993 on
repofiles-client-0
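
The GFID pairs reported in these logs can be pulled out with a short
script. This is only a sketch: the function name gfid_mismatches and
the regular expressions are illustrative, fitted to the two message
formats quoted above, and are not part of any gluster tooling.

```python
import re

# A GFID is printed as a lowercase UUID in the quoted log lines.
UUID = r"[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}"
# Matches "<gfid> on <subvolume>", e.g. "e42d... on repofiles-client-1".
PAIR_RE = re.compile(r"(" + UUID + r")\s+on\s+([\w-]+)")
# Every log entry starts with a "[YYYY-MM-DD " timestamp.
ENTRY_START_RE = re.compile(r"\s(?=\[\d{4}-\d{2}-\d{2} )")


def gfid_mismatches(log_text):
    """Return one {subvolume: gfid} dict per distinct mismatch message."""
    # Entries may be hard-wrapped, so collapse all whitespace first.
    flat = " ".join(log_text.split())
    found = []
    for entry in ENTRY_START_RE.split(flat):
        if "gfid mismatch" not in entry.lower():
            continue
        pairs = {subvol: gfid for gfid, subvol in PAIR_RE.findall(entry)}
        # Skip repeats of a mismatch that was already recorded.
        if pairs and pairs not in found:
            found.append(pairs)
    return found


if __name__ == "__main__":
    import sys

    # Usage: python gfid_mismatches.py /path/to/glfsheal-repofiles.log
    with open(sys.argv[1]) as handle:
        for mismatch in gfid_mismatches(handle.read()):
            for subvol, gfid in sorted(mismatch.items()):
                print(subvol, gfid)
```

Run against the excerpts above, it should report a single mismatch
for /test: e42d3f03-0633-4954-95ce-5cd8710e595e on repofiles-client-1
and 16da3178-8a6e-4010-b874-7f11449d1993 on repofiles-client-0.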
Where did I go wrong when trying to resolve the split-brain?
Best regards,