Hi, I have been running a GlusterFS volume for a while, and everything
worked just fine even after a one-node failure. However, I went for a
brick replacement because my bricks were not thin-provisioned and I
wanted to use snapshots. In short, the whole volume went down because the
heal daemon took all the I/O, and all the VMs running on top of that
volume became unresponsive.
So I am now rebuilding the volume from scratch. I created new thinly
provisioned bricks and mounted them:
lvs:
brick_s3-sata-10k vg_s3-sata-10k Vwi-aotz 931,25g s3-sata-10k_pool 2,95
s3-sata-10k_pool vg_s3-sata-10k twi-a-tz 931,25g
vgs:
vg_s3-sata-10k 1 3 0 wz--n- 931,51g 148,00m
df:
/dev/mapper/vg_s3--sata--10k-brick_s3--sata--10k 976009600 28383480 947626120 3% /gfs/s3-sata-10k
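For reference, the bricks were created with plain LVM thin provisioning, roughly along these lines (a sketch from memory; the physical-device setup and the mkfs/mount options are just examples, only the VG/LV/pool names match the output above):

# create the thin pool inside the existing VG (PV/VG creation not shown)
lvcreate -L 931G --thinpool s3-sata-10k_pool vg_s3-sata-10k
# carve a thin LV out of the pool for the brick
lvcreate -V 931G --thin -n brick_s3-sata-10k vg_s3-sata-10k/s3-sata-10k_pool
# XFS with 512-byte inodes is the usual choice for Gluster bricks
mkfs.xfs -i size=512 /dev/vg_s3-sata-10k/brick_s3-sata-10k
mount /dev/vg_s3-sata-10k/brick_s3-sata-10k /gfs/s3-sata-10k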
When I uploaded two images onto the new volume, I found there might be a
problem. For the time being I run the volume in replica 2 mode across two
servers. The files were copied via node1, and I think they are intact on
node1 only. However, volume heal indicates everything is OK.
My symptoms are as follows:
df information from both servers:
/dev/mapper/vg_s3--sata--10k-brick_s3--sata--10k 976009600 30754296 945255304 4% /gfs/s3-sata-10k
/dev/mapper/vg_s3--sata--10k-brick_s3--sata--10k 976009600 28383480 947626120 3% /gfs/s3-sata-10k
[root@nodef01i ~]# du /gfs/s3-sata-10k/
0 /gfs/s3-sata-10k/fs/.glusterfs/indices/xattrop
0 /gfs/s3-sata-10k/fs/.glusterfs/indices
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs/htime
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs/csnap
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs
0 /gfs/s3-sata-10k/fs/.glusterfs/00/00
0 /gfs/s3-sata-10k/fs/.glusterfs/00
0 /gfs/s3-sata-10k/fs/.glusterfs/landfill
20480004 /gfs/s3-sata-10k/fs/.glusterfs/84/26
20480004 /gfs/s3-sata-10k/fs/.glusterfs/84
10240000 /gfs/s3-sata-10k/fs/.glusterfs/d0/ff
10240000 /gfs/s3-sata-10k/fs/.glusterfs/d0
30720008 /gfs/s3-sata-10k/fs/.glusterfs
30720008 /gfs/s3-sata-10k/fs
30720008 /gfs/s3-sata-10k/
[root@nodef02i ~]# du /gfs/s3-sata-10k/
0 /gfs/s3-sata-10k/fs/.glusterfs/indices/xattrop
0 /gfs/s3-sata-10k/fs/.glusterfs/indices
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs/htime
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs/csnap
0 /gfs/s3-sata-10k/fs/.glusterfs/changelogs
0 /gfs/s3-sata-10k/fs/.glusterfs/00/00
0 /gfs/s3-sata-10k/fs/.glusterfs/00
0 /gfs/s3-sata-10k/fs/.glusterfs/landfill
18727172 /gfs/s3-sata-10k/fs/.glusterfs/84/26
18727172 /gfs/s3-sata-10k/fs/.glusterfs/84
9622016 /gfs/s3-sata-10k/fs/.glusterfs/d0/ff
9622016 /gfs/s3-sata-10k/fs/.glusterfs/d0
28349192 /gfs/s3-sata-10k/fs/.glusterfs
28349192 /gfs/s3-sata-10k/fs
28349192 /gfs/s3-sata-10k/
[root@nodef01i ~]# du /gfs/s3-sata-10k/fs/*
20480004 /gfs/s3-sata-10k/fs/f1607f25aa52f4fb6f98f20ef0f3f9d7
10240000 /gfs/s3-sata-10k/fs/3706a2cb0bb27ba5787b3c12388f4ebb
[root@nodef02i ~]# du /gfs/s3-sata-10k/fs/*
18727172 /gfs/s3-sata-10k/fs/f1607f25aa52f4fb6f98f20ef0f3f9d7
9622016 /gfs/s3-sata-10k/fs/3706a2cb0bb27ba5787b3c12388f4ebb
[root@nodef01i ~]# ll /gfs/s3-sata-10k/fs/
celkem 30720004
-rw-r----- 2 oneadmin oneadmin 20971520512 3. srp 23.53 f1607f25aa52f4fb6f98f20ef0f3f9d7
-rw-r----- 2 oneadmin oneadmin 10485760000 16. srp 11.23 3706a2cb0bb27ba5787b3c12388f4ebb
[root@nodef02i ~]# ll /gfs/s3-sata-10k/fs/
celkem 28349188
-rw-r----- 2 oneadmin oneadmin 20971520512 3. srp 23.53 f1607f25aa52f4fb6f98f20ef0f3f9d7
-rw-r----- 2 oneadmin oneadmin 10485760000 16. srp 11.22 3706a2cb0bb27ba5787b3c12388f4ebb
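Note that ls shows identical apparent file sizes on both nodes; only the allocated blocks reported by du differ. To double-check that, I could run something like this on each node (just an idea, not output I have yet):

# apparent size vs allocated size of one image on the brick
du --apparent-size -k /gfs/s3-sata-10k/fs/f1607f25aa52f4fb6f98f20ef0f3f9d7
du -k /gfs/s3-sata-10k/fs/f1607f25aa52f4fb6f98f20ef0f3f9d7
# and compare content checksums between the two bricks
md5sum /gfs/s3-sata-10k/fs/f1607f25aa52f4fb6f98f20ef0f3f9d7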
[root@nodef01i ~]# gluster volume heal ph-fs-0 info split-brain
Gathering list of split brain entries on volume ph-fs-0 has been successful
Brick 10.11.100.1:/gfs/s3-sata-10k/fs
Number of entries: 0
Brick 10.11.100.2:/gfs/s3-sata-10k/fs
Number of entries: 0
[root@nodef01i ~]# gluster volume heal ph-fs-0 info
Brick nodef01i.czprg:/gfs/s3-sata-10k/fs/
Number of entries: 0
Brick nodef02i.czprg:/gfs/s3-sata-10k/fs/
Number of entries: 0
[root@nodef01i ~]# gluster volume status
Status of volume: ph-fs-0
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick 10.11.100.1:/gfs/s3-sata-10k/fs 49152 Y 3733
Brick 10.11.100.2:/gfs/s3-sata-10k/fs 49152 Y 64711
NFS Server on localhost 2049 Y 3747
Self-heal Daemon on localhost N/A Y 3752
NFS Server on 10.11.100.2 2049 Y 64725
Self-heal Daemon on 10.11.100.2 N/A Y 64730
Task Status of Volume ph-fs-0
------------------------------------------------------------------------------
There are no active volume tasks
[root@nodef02i ~]# gluster volume status
Status of volume: ph-fs-0
Gluster process Port Online Pid
------------------------------------------------------------------------------
Brick 10.11.100.1:/gfs/s3-sata-10k/fs 49152 Y 3733
Brick 10.11.100.2:/gfs/s3-sata-10k/fs 49152 Y 64711
NFS Server on localhost 2049 Y 64725
Self-heal Daemon on localhost N/A Y 64730
NFS Server on 10.11.100.1 2049 Y 3747
Self-heal Daemon on 10.11.100.1 N/A Y 3752
Task Status of Volume ph-fs-0
------------------------------------------------------------------------------
There are no active volume tasks
[root@nodef02i ~]# rpm -qa | grep gluster
glusterfs-server-3.6.2-1.el6.x86_64
glusterfs-3.6.2-1.el6.x86_64
glusterfs-api-3.6.2-1.el6.x86_64
glusterfs-libs-3.6.2-1.el6.x86_64
glusterfs-cli-3.6.2-1.el6.x86_64
glusterfs-fuse-3.6.2-1.el6.x86_64
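If it helps, I can also dump the AFR changelog extended attributes for the two files directly from both bricks, e.g.:

getfattr -d -m . -e hex /gfs/s3-sata-10k/fs/f1607f25aa52f4fb6f98f20ef0f3f9d7
getfattr -d -m . -e hex /gfs/s3-sata-10k/fs/3706a2cb0bb27ba5787b3c12388f4ebb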
What other information should I provide?
Thanks Milos
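Below is an excerpt from glustershd.log (the self-heal daemon log, per the -l argument visible in the startup lines):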
[2015-08-02 20:57:02.091770] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-08-02 20:57:02.175308] W [socket.c:611:__socket_rwv] 0-ph-fs-0-client-0: readv on 10.11.100.1:49152 failed (Data nejsou k dispozici)
[2015-08-02 20:57:02.175438] I [client.c:2215:client_rpc_notify] 0-ph-fs-0-client-0: disconnected from ph-fs-0-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2015-08-02 20:57:04.549669] I [glusterfsd-mgmt.c:1504:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-08-02 20:57:12.531944] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-ph-fs-0-client-0: changing port to 49152 (from 0)
[2015-08-02 20:57:12.538004] I [client-handshake.c:1413:select_server_supported_programs] 0-ph-fs-0-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-08-02 20:57:12.538423] I [client-handshake.c:1200:client_setvolume_cbk] 0-ph-fs-0-client-0: Connected to ph-fs-0-client-0, attached to remote volume '/gfs/s3-sata-10k/fs'.
[2015-08-02 20:57:12.538455] I [client-handshake.c:1210:client_setvolume_cbk] 0-ph-fs-0-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2015-08-02 20:57:12.538577] I [client-handshake.c:188:client_set_lk_version_cbk] 0-ph-fs-0-client-0: Server lk version = 1
[2015-08-03 08:03:15.253536] W [socket.c:611:__socket_rwv] 0-glusterfs: readv on 127.0.0.1:24007 failed (Data nejsou k dispozici)
[2015-08-03 08:03:25.579086] I [glusterfsd-mgmt.c:1504:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-08-03 08:05:55.229913] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15), shutting down
[2015-08-03 08:05:56.208873] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/4746ba0e3011181c6105e61a38cab79e.socket --xlator-option *replicate*.node-uuid=9020ef8e-5d56-4a8c-8716-fed4f1348f30)
[2015-08-03 08:05:56.281563] I [graph.c:269:gf_add_cmdline_options] 0-ph-fs-0-replicate-0: adding option 'node-uuid' for volume 'ph-fs-0-replicate-0' with value '9020ef8e-5d56-4a8c-8716-fed4f1348f30'
[2015-08-03 08:05:56.286248] I [client.c:2280:notify] 0-ph-fs-0-client-0: parent translators are ready, attempting connect on transport
[2015-08-03 08:05:56.291983] I [client.c:2280:notify] 0-ph-fs-0-client-1: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
1: volume ph-fs-0-client-0
2: type protocol/client
3: option ping-timeout 5
4: option remote-host 10.11.100.1
5: option remote-subvolume /gfs/s3-sata-10k/fs
6: option transport-type socket
7: option username f46fcab9-8c58-4006-9871-1d7a3b02949a
8: option password 48151685-a3b0-43e7-b0fe-e9eb19897764
9: end-volume
10:
11: volume ph-fs-0-client-1
12: type protocol/client
13: option ping-timeout 5
14: option remote-host 10.11.100.2
15: option remote-subvolume /gfs/s3-sata-10k/fs
16: option transport-type socket
17: option username f46fcab9-8c58-4006-9871-1d7a3b02949a
18: option password 48151685-a3b0-43e7-b0fe-e9eb19897764
19: end-volume
20:
21: volume ph-fs-0-replicate-0
22: type cluster/replicate
23: option node-uuid 9020ef8e-5d56-4a8c-8716-fed4f1348f30
24: option background-self-heal-count 0
25: option metadata-self-heal on
26: option data-self-heal on
27: option entry-self-heal on
28: option self-heal-daemon on
29: option iam-self-heal-daemon yes
30: subvolumes ph-fs-0-client-0 ph-fs-0-client-1
31: end-volume
32:
33: volume glustershd
34: type debug/io-stats
35: subvolumes ph-fs-0-replicate-0
36: end-volume
37:
+------------------------------------------------------------------------------+
[2015-08-03 08:05:56.298459] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-ph-fs-0-client-0: changing port to 49152 (from 0)
[2015-08-03 08:05:56.304339] I [client-handshake.c:1413:select_server_supported_programs] 0-ph-fs-0-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-08-03 08:05:56.304817] I [client-handshake.c:1200:client_setvolume_cbk] 0-ph-fs-0-client-0: Connected to ph-fs-0-client-0, attached to remote volume '/gfs/s3-sata-10k/fs'.
[2015-08-03 08:05:56.304851] I [client-handshake.c:1210:client_setvolume_cbk] 0-ph-fs-0-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2015-08-03 08:05:56.304946] I [MSGID: 108005] [afr-common.c:3552:afr_notify] 0-ph-fs-0-replicate-0: Subvolume 'ph-fs-0-client-0' came back up; going online.
[2015-08-03 08:05:56.305009] I [client-handshake.c:188:client_set_lk_version_cbk] 0-ph-fs-0-client-0: Server lk version = 1
[2015-08-03 08:05:56.405884] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-ph-fs-0-client-1: changing port to 49152 (from 0)
[2015-08-03 08:05:56.411872] I [client-handshake.c:1413:select_server_supported_programs] 0-ph-fs-0-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-08-03 08:05:56.433809] I [client-handshake.c:1200:client_setvolume_cbk] 0-ph-fs-0-client-1: Connected to ph-fs-0-client-1, attached to remote volume '/gfs/s3-sata-10k/fs'.
[2015-08-03 08:05:56.438515] I [client-handshake.c:1210:client_setvolume_cbk] 0-ph-fs-0-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2015-08-03 08:05:56.439370] I [client-handshake.c:188:client_set_lk_version_cbk] 0-ph-fs-0-client-1: Server lk version = 1
[2015-08-03 08:59:09.818197] W [socket.c:611:__socket_rwv] 0-glusterfs: readv on 127.0.0.1:24007 failed (Data nejsou k dispozici)
[2015-08-03 08:59:20.590909] E [socket.c:2267:socket_connect_finish] 0-glusterfs: connection to 127.0.0.1:24007 failed (Spojení odmítnuto)
[2015-08-03 19:39:22.364177] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15), shutting down
[2015-08-03 19:39:23.374128] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/4746ba0e3011181c6105e61a38cab79e.socket --xlator-option *replicate*.node-uuid=9020ef8e-5d56-4a8c-8716-fed4f1348f30)
[2015-08-03 19:39:23.484979] I [graph.c:269:gf_add_cmdline_options] 0-ph-fs-0-replicate-0: adding option 'node-uuid' for volume 'ph-fs-0-replicate-0' with value '9020ef8e-5d56-4a8c-8716-fed4f1348f30'
[2015-08-03 19:39:23.489502] I [client.c:2280:notify] 0-ph-fs-0-client-0: parent translators are ready, attempting connect on transport
[2015-08-03 19:39:23.495182] I [client.c:2280:notify] 0-ph-fs-0-client-1: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
1: volume ph-fs-0-client-0
2: type protocol/client
3: option ping-timeout 5
4: option remote-host 10.11.100.1
5: option remote-subvolume /gfs/s3-sata-10k/fs
6: option transport-type socket
7: option username f46fcab9-8c58-4006-9871-1d7a3b02949a
8: option password 48151685-a3b0-43e7-b0fe-e9eb19897764
9: end-volume
10:
11: volume ph-fs-0-client-1
12: type protocol/client
13: option ping-timeout 5
14: option remote-host 10.11.100.2
15: option remote-subvolume /gfs/s3-sata-10k/fs
16: option transport-type socket
17: option username f46fcab9-8c58-4006-9871-1d7a3b02949a
18: option password 48151685-a3b0-43e7-b0fe-e9eb19897764
19: end-volume
20:
21: volume ph-fs-0-replicate-0
22: type cluster/replicate
23: option node-uuid 9020ef8e-5d56-4a8c-8716-fed4f1348f30
24: option background-self-heal-count 0
25: option metadata-self-heal on
26: option data-self-heal on
27: option entry-self-heal on
28: option self-heal-daemon on
29: option iam-self-heal-daemon yes
30: subvolumes ph-fs-0-client-0 ph-fs-0-client-1
31: end-volume
32:
33: volume glustershd
34: type debug/io-stats
35: subvolumes ph-fs-0-replicate-0
36: end-volume
37:
+------------------------------------------------------------------------------+
[2015-08-03 19:39:23.596062] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-ph-fs-0-client-1: changing port to 49152 (from 0)
[2015-08-03 19:39:23.602490] I [client-handshake.c:1413:select_server_supported_programs] 0-ph-fs-0-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-08-03 19:39:23.603008] I [client-handshake.c:1200:client_setvolume_cbk] 0-ph-fs-0-client-1: Connected to ph-fs-0-client-1, attached to remote volume '/gfs/s3-sata-10k/fs'.
[2015-08-03 19:39:23.603063] I [client-handshake.c:1210:client_setvolume_cbk] 0-ph-fs-0-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2015-08-03 19:39:23.603208] I [MSGID: 108005] [afr-common.c:3552:afr_notify] 0-ph-fs-0-replicate-0: Subvolume 'ph-fs-0-client-1' came back up; going online.
[2015-08-03 19:39:23.603301] I [client-handshake.c:188:client_set_lk_version_cbk] 0-ph-fs-0-client-1: Server lk version = 1
[2015-08-03 19:39:23.612396] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-ph-fs-0-client-0: changing port to 49152 (from 0)
[2015-08-03 19:39:23.618284] I [client-handshake.c:1413:select_server_supported_programs] 0-ph-fs-0-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-08-03 19:39:23.623400] I [client-handshake.c:1200:client_setvolume_cbk] 0-ph-fs-0-client-0: Connected to ph-fs-0-client-0, attached to remote volume '/gfs/s3-sata-10k/fs'.
[2015-08-03 19:39:23.623444] I [client-handshake.c:1210:client_setvolume_cbk] 0-ph-fs-0-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2015-08-03 19:39:23.625460] I [client-handshake.c:188:client_set_lk_version_cbk] 0-ph-fs-0-client-0: Server lk version = 1
[2015-08-03 21:08:12.742540] W [socket.c:611:__socket_rwv] 0-glusterfs: readv on 127.0.0.1:24007 failed (Data nejsou k dispozici)
[2015-08-03 21:08:23.069474] E [socket.c:2267:socket_connect_finish] 0-glusterfs: connection to 127.0.0.1:24007 failed (Spojení odmítnuto)
[2015-08-03 21:09:46.484966] W [socket.c:611:__socket_rwv] 0-ph-fs-0-client-0: readv on 10.11.100.1:49152 failed (Data nejsou k dispozici)
[2015-08-03 21:09:46.485131] I [client.c:2215:client_rpc_notify] 0-ph-fs-0-client-0: disconnected from ph-fs-0-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2015-08-03 21:09:57.346042] E [socket.c:2267:socket_connect_finish] 0-ph-fs-0-client-0: connection to 10.11.100.1:24007 failed (Spojení odmítnuto)
[2015-08-03 21:11:00.522287] W [socket.c:611:__socket_rwv] 0-ph-fs-0-client-1: readv on 10.11.100.2:49152 failed (Data nejsou k dispozici)
[2015-08-03 21:11:00.522398] I [client.c:2215:client_rpc_notify] 0-ph-fs-0-client-1: disconnected from ph-fs-0-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2015-08-03 21:11:00.522472] E [MSGID: 108006] [afr-common.c:3591:afr_notify] 0-ph-fs-0-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-08-03 21:11:10.692572] E [socket.c:2267:socket_connect_finish] 0-ph-fs-0-client-1: connection to 10.11.100.2:24007 failed (Spojení odmítnuto)
[2015-08-03 21:11:56.078533] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15), shutting down
[2015-08-03 21:32:52.827862] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/4746ba0e3011181c6105e61a38cab79e.socket --xlator-option *replicate*.node-uuid=9020ef8e-5d56-4a8c-8716-fed4f1348f30)
[2015-08-03 21:32:54.072291] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15), shutting down
[2015-08-03 21:32:55.083431] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/4746ba0e3011181c6105e61a38cab79e.socket --xlator-option *replicate*.node-uuid=9020ef8e-5d56-4a8c-8716-fed4f1348f30)
[2015-08-03 21:32:55.113370] I [graph.c:269:gf_add_cmdline_options] 0-ph-fs-0-replicate-0: adding option 'node-uuid' for volume 'ph-fs-0-replicate-0' with value '9020ef8e-5d56-4a8c-8716-fed4f1348f30'
[2015-08-03 21:32:55.118127] I [client.c:2280:notify] 0-ph-fs-0-client-0: parent translators are ready, attempting connect on transport
[2015-08-03 21:32:55.123759] I [client.c:2280:notify] 0-ph-fs-0-client-1: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
1: volume ph-fs-0-client-0
2: type protocol/client
3: option ping-timeout 5
4: option remote-host 10.11.100.1
5: option remote-subvolume /gfs/s3-sata-10k/fs
6: option transport-type socket
7: option username f46fcab9-8c58-4006-9871-1d7a3b02949a
8: option password 48151685-a3b0-43e7-b0fe-e9eb19897764
9: end-volume
10:
11: volume ph-fs-0-client-1
12: type protocol/client
13: option ping-timeout 5
14: option remote-host 10.11.100.2
15: option remote-subvolume /gfs/s3-sata-10k/fs
16: option transport-type socket
17: option username f46fcab9-8c58-4006-9871-1d7a3b02949a
18: option password 48151685-a3b0-43e7-b0fe-e9eb19897764
19: end-volume
20:
21: volume ph-fs-0-replicate-0
22: type cluster/replicate
23: option node-uuid 9020ef8e-5d56-4a8c-8716-fed4f1348f30
24: option background-self-heal-count 0
25: option metadata-self-heal on
26: option data-self-heal on
27: option entry-self-heal on
28: option self-heal-daemon on
29: option iam-self-heal-daemon yes
30: subvolumes ph-fs-0-client-0 ph-fs-0-client-1
31: end-volume
32:
33: volume glustershd
34: type debug/io-stats
35: subvolumes ph-fs-0-replicate-0
36: end-volume
37:
+------------------------------------------------------------------------------+
[2015-08-03 21:32:55.132476] E [client-handshake.c:1496:client_query_portmap_cbk] 0-ph-fs-0-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-08-03 21:32:55.132567] I [client.c:2215:client_rpc_notify] 0-ph-fs-0-client-1: disconnected from ph-fs-0-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2015-08-03 21:32:55.136621] E [client-handshake.c:1496:client_query_portmap_cbk] 0-ph-fs-0-client-0: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-08-03 21:32:55.136718] I [client.c:2215:client_rpc_notify] 0-ph-fs-0-client-0: disconnected from ph-fs-0-client-0. Client process will keep trying to connect to glusterd until brick's port is available
[2015-08-03 21:32:55.136748] E [MSGID: 108006] [afr-common.c:3591:afr_notify] 0-ph-fs-0-replicate-0: All subvolumes are down. Going offline until atleast one of them comes back up.
[2015-08-03 21:36:37.388928] W [socket.c:611:__socket_rwv] 0-glusterfs: readv on 127.0.0.1:24007 failed (Data nejsou k dispozici)
[2015-08-03 21:36:38.938833] E [socket.c:2267:socket_connect_finish] 0-ph-fs-0-client-0: connection to 10.11.100.1:24007 failed (Spojení odmítnuto)
[2015-08-03 21:36:47.973367] E [socket.c:2267:socket_connect_finish] 0-glusterfs: connection to 127.0.0.1:24007 failed (Spojení odmítnuto)
[2015-08-03 21:36:51.007615] E [socket.c:2267:socket_connect_finish] 0-ph-fs-0-client-1: connection to 10.11.100.2:24007 failed (Spojení odmítnuto)
[2015-08-03 21:37:06.303666] W [glusterfsd.c:1194:cleanup_and_exit] (--> 0-: received signum (15), shutting down
[2015-08-03 21:45:20.938143] I [MSGID: 100030] [glusterfsd.c:2018:main] 0-/usr/sbin/glusterfs: Started running /usr/sbin/glusterfs version 3.6.2 (args: /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/lib/glusterd/glustershd/run/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/018717d81d4b6b13483d7d5d2a35f9b0.socket --xlator-option *replicate*.node-uuid=db8524a9-dabc-4be0-8ac1-6357e9fd576d)
[2015-08-03 21:45:20.956970] I [graph.c:269:gf_add_cmdline_options] 0-ph-fs-0-replicate-0: adding option 'node-uuid' for volume 'ph-fs-0-replicate-0' with value 'db8524a9-dabc-4be0-8ac1-6357e9fd576d'
[2015-08-03 21:45:20.962361] I [client.c:2280:notify] 0-ph-fs-0-client-0: parent translators are ready, attempting connect on transport
[2015-08-03 21:45:20.968270] I [client.c:2280:notify] 0-ph-fs-0-client-1: parent translators are ready, attempting connect on transport
Final graph:
+------------------------------------------------------------------------------+
1: volume ph-fs-0-client-0
2: type protocol/client
3: option ping-timeout 42
4: option remote-host 10.11.100.1
5: option remote-subvolume /gfs/s3-sata-10k/fs
6: option transport-type socket
7: option username 4e8b78d0-0c54-4eaf-b0e0-75985351f8e3
8: option password df8a4093-a435-48cd-80ac-22c1807c2320
9: end-volume
10:
11: volume ph-fs-0-client-1
12: type protocol/client
13: option ping-timeout 42
14: option remote-host 10.11.100.2
15: option remote-subvolume /gfs/s3-sata-10k/fs
16: option transport-type socket
17: option username 4e8b78d0-0c54-4eaf-b0e0-75985351f8e3
18: option password df8a4093-a435-48cd-80ac-22c1807c2320
19: end-volume
20:
21: volume ph-fs-0-replicate-0
22: type cluster/replicate
23: option node-uuid db8524a9-dabc-4be0-8ac1-6357e9fd576d
24: option background-self-heal-count 0
25: option metadata-self-heal on
26: option data-self-heal on
27: option entry-self-heal on
28: option self-heal-daemon on
29: option iam-self-heal-daemon yes
30: subvolumes ph-fs-0-client-0 ph-fs-0-client-1
31: end-volume
32:
33: volume glustershd
34: type debug/io-stats
35: subvolumes ph-fs-0-replicate-0
36: end-volume
37:
+------------------------------------------------------------------------------+
[2015-08-03 21:45:20.975282] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-ph-fs-0-client-0: changing port to 49152 (from 0)
[2015-08-03 21:45:20.981541] I [client-handshake.c:1413:select_server_supported_programs] 0-ph-fs-0-client-0: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-08-03 21:45:20.987792] I [client-handshake.c:1200:client_setvolume_cbk] 0-ph-fs-0-client-0: Connected to ph-fs-0-client-0, attached to remote volume '/gfs/s3-sata-10k/fs'.
[2015-08-03 21:45:20.987837] I [client-handshake.c:1210:client_setvolume_cbk] 0-ph-fs-0-client-0: Server and Client lk-version numbers are not same, reopening the fds
[2015-08-03 21:45:20.987957] I [MSGID: 108005] [afr-common.c:3552:afr_notify] 0-ph-fs-0-replicate-0: Subvolume 'ph-fs-0-client-0' came back up; going online.
[2015-08-03 21:45:20.988018] I [client-handshake.c:188:client_set_lk_version_cbk] 0-ph-fs-0-client-0: Server lk version = 1
[2015-08-03 21:45:21.099909] E [client-handshake.c:1496:client_query_portmap_cbk] 0-ph-fs-0-client-1: failed to get the port number for remote subvolume. Please run 'gluster volume status' on server to see if brick process is running.
[2015-08-03 21:45:21.100011] I [client.c:2215:client_rpc_notify] 0-ph-fs-0-client-1: disconnected from ph-fs-0-client-1. Client process will keep trying to connect to glusterd until brick's port is available
[2015-08-03 21:45:22.039032] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-08-03 21:45:22.338512] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-08-03 21:45:22.339588] I [glusterfsd-mgmt.c:56:mgmt_cbk_spec] 0-mgmt: Volume file changed
[2015-08-03 21:45:22.343040] I [graph.c:269:gf_add_cmdline_options] 0-ph-fs-0-replicate-0: adding option 'node-uuid' for volume 'ph-fs-0-replicate-0' with value 'db8524a9-dabc-4be0-8ac1-6357e9fd576d'
[2015-08-03 21:45:22.344644] I [glusterfsd-mgmt.c:1504:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-08-03 21:45:22.347043] I [glusterfsd-mgmt.c:1504:mgmt_getspec_cbk] 0-glusterfs: No change in volfile, continuing
[2015-08-03 21:45:24.945900] I [rpc-clnt.c:1761:rpc_clnt_reconfig] 0-ph-fs-0-client-1: changing port to 49152 (from 0)
[2015-08-03 21:45:24.952609] I [client-handshake.c:1413:select_server_supported_programs] 0-ph-fs-0-client-1: Using Program GlusterFS 3.3, Num (1298437), Version (330)
[2015-08-03 21:45:24.953042] I [client-handshake.c:1200:client_setvolume_cbk] 0-ph-fs-0-client-1: Connected to ph-fs-0-client-1, attached to remote volume '/gfs/s3-sata-10k/fs'.
[2015-08-03 21:45:24.953096] I [client-handshake.c:1210:client_setvolume_cbk] 0-ph-fs-0-client-1: Server and Client lk-version numbers are not same, reopening the fds
[2015-08-03 21:45:24.953342] I [client-handshake.c:188:client_set_lk_version_cbk] 0-ph-fs-0-client-1: Server lk version = 1
[2015-08-10 17:22:35.422462] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-ph-fs-0-replicate-0: Completed data selfheal on d0ff0185-f41d-4858-b80b-2624f42899df. source=0 sinks=1
[2015-08-10 17:29:13.086650] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-ph-fs-0-replicate-0: Completed data selfheal on d0ff0185-f41d-4858-b80b-2624f42899df. source=0 sinks=1
[2015-08-16 02:21:08.068853] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-ph-fs-0-replicate-0: Completed data selfheal on d0ff0185-f41d-4858-b80b-2624f42899df. source=0 sinks=1
[2015-08-16 02:23:18.591072] I [afr-self-heal-common.c:476:afr_log_selfheal] 0-ph-fs-0-replicate-0: Completed data selfheal on d0ff0185-f41d-4858-b80b-2624f42899df. source=0 sinks=1
[2015-08-16 09:32:27.702200] I [afr-self-heald.c:676:afr_shd_full_healer] 0-ph-fs-0-replicate-0: starting full sweep on subvol ph-fs-0-client-0
[2015-08-16 09:32:27.709516] I [afr-self-heald.c:686:afr_shd_full_healer] 0-ph-fs-0-replicate-0: finished full sweep on subvol ph-fs-0-client-0