On Tue, Oct 24, 2017 at 11:13 PM, Alastair Neil <ajneil.tech@xxxxxxxxx> wrote:
Peculiar behaviour. If I kill the glusterfs brick daemon and restart glusterd, then the brick becomes available, but one of my other volumes' bricks on the same server goes down in the same way. It's like whack-a-mole. Gluster version 3.10.6, replica 3 volume; the daemon is present but does not appear to be functioning. Any ideas?
The subject and the data look contradictory to me. The brick log (what you shared) doesn't have a cleanup_and_exit () trigger for a shutdown. Are you sure the brick is down? OTOH, I see a port mismatch for brick7/digitalcorpora, where the brick process has 49154 but gluster volume status shows 49152. There is an issue with stale ports which we're trying to address through https://review.gluster.org/18541 . But could you specify what exactly the problem is? Is it the stale port, or the conflict between the volume status output and the actual brick health? If it's the latter, I'd need further information, such as the output of the "gluster get-state" command from the same node.
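As a rough illustration of the port comparison described above, here is a minimal Python sketch. It assumes you have saved a glusterfsd startup log line and the text of the volume status output; the helper names (brick_port_from_log, status_ports) are hypothetical, the log line is abbreviated to a few of its arguments, and the parsing relies only on the column layout visible in this thread.

```python
import re

# One status row after a wrapped brick name has been rejoined:
# "Brick <name>  <tcp port>  <rdma port>  <online>  <pid>"
ROW = re.compile(r"^Brick (\S+)\s+(\d+|N/A)\s+(\d+|N/A)\s+([YN])\s+(\d+|N/A)\s*$")


def brick_port_from_log(line):
    """Return the --brick-port value from a glusterfsd startup log line, or None."""
    m = re.search(r"--brick-port\s+(\d+)", line)
    return int(m.group(1)) if m else None


def status_ports(status_text):
    """Map brick name -> TCP port, rejoining brick names wrapped onto a second line."""
    ports = {}
    pending = ""
    for raw in status_text.splitlines():
        line = raw.rstrip()
        if line.startswith("Brick "):
            pending = line              # start of a (possibly wrapped) brick row
        elif pending:
            pending += line.lstrip()    # continuation of the wrapped brick name
        else:
            continue                    # header, separator, self-heal rows, etc.
        m = ROW.match(pending)
        if m:
            ports[m.group(1)] = None if m.group(2) == "N/A" else int(m.group(2))
            pending = ""
    return ports


# Sample data taken from this thread (log arguments abbreviated).
LOG_LINE = (
    "[2017-10-24 17:18:59.270384] I [MSGID: 100030] [glusterfsd.c:2503:main] "
    "0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.10.6 "
    "(args: /usr/sbin/glusterfsd -s gluster-2 "
    "--brick-name /export/brick7/digitalcorpora --brick-port 49154)"
)
STATUS = """\
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster-2:/export/brick7/digitalcorpo
ra                                          49156     0          Y       125708
Brick gluster0:/export/brick7/digitalcorpor
a                                           49152     0          Y       16098
Self-heal Daemon on localhost               N/A       N/A        Y       126625
"""

if __name__ == "__main__":
    brick = "gluster-2:/export/brick7/digitalcorpora"
    log_port = brick_port_from_log(LOG_LINE)
    listed_port = status_ports(STATUS).get(brick)
    verdict = "match" if log_port == listed_port else "MISMATCH"
    print(f"{brick}: log={log_port} status={listed_port} -> {verdict}")
```

A mismatch reported here only says that the two sources disagree. It does not by itself tell you which port the brick is actually listening on.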
[root@gluster-2 bricks]# glv status digitalcorpora
Status of volume: digitalcorpora
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick gluster-2:/export/brick7/digitalcorpo
ra                                          49156     0          Y       125708
Brick gluster1.vsnet.gmu.edu:/export/brick7
/digitalcorpora                             49152     0          Y       12345
Brick gluster0:/export/brick7/digitalcorpor
a                                           49152     0          Y       16098
Self-heal Daemon on localhost               N/A       N/A        Y       126625
Self-heal Daemon on gluster1                N/A       N/A        Y       15405
Self-heal Daemon on gluster0                N/A       N/A        Y       18584

Task Status of Volume digitalcorpora
------------------------------------------------------------------------------
There are no active volume tasks

[root@gluster-2 bricks]# glv heal digitalcorpora info
Brick gluster-2:/export/brick7/digitalcorpora
Status: Transport endpoint is not connected
Number of entries: -
Brick gluster1.vsnet.gmu.edu:/export/brick7/digitalcorpora
/.trashcan
/DigitalCorpora/hello2.txt
/DigitalCorpora
Status: Connected
Number of entries: 3
Brick gluster0:/export/brick7/digitalcorpora
/.trashcan
/DigitalCorpora/hello2.txt
/DigitalCorpora
Status: Connected
Number of entries: 3
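
For anyone who wants to pick out the unreachable brick from heal info output like the above, a minimal Python sketch follows. The function name disconnected_bricks is hypothetical, and the parsing assumes only the "Brick ..." / "Status: ..." layout shown in this thread.

```python
def disconnected_bricks(heal_info):
    """Return (brick, status) pairs for bricks whose Status is not 'Connected'."""
    result = []
    current = None
    for raw in heal_info.splitlines():
        line = raw.strip()
        if line.startswith("Brick "):
            current = line[len("Brick "):]      # remember which brick we are in
        elif line.startswith("Status:") and current is not None:
            status = line[len("Status:"):].strip()
            if status != "Connected":
                result.append((current, status))
    return result


# Sample text copied from the heal info output in this thread (shortened).
HEAL_INFO = """\
Brick gluster-2:/export/brick7/digitalcorpora
Status: Transport endpoint is not connected
Number of entries: -
Brick gluster0:/export/brick7/digitalcorpora
/.trashcan
/DigitalCorpora/hello2.txt
/DigitalCorpora
Status: Connected
Number of entries: 3
"""

if __name__ == "__main__":
    for brick, status in disconnected_bricks(HEAL_INFO):
        print(f"{brick}: {status}")
```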
[2017-10-24 17:18:48.288505] W [glusterfsd.c:1360:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7e25) [0x7f6f83c9de25] -->/usr/sbin/glusterfsd(glusterfs_sigwaiter+0xe5) [0x55a148eeb135] -->/usr/sbin/glusterfsd(cleanup_and_exit+0x6b) [0x55a148eeaf5b] ) 0-: received signum (15), shutting down
[2017-10-24 17:18:59.270384] I [MSGID: 100030] [glusterfsd.c:2503:main] 0-/usr/sbin/glusterfsd: Started running /usr/sbin/glusterfsd version 3.10.6 (args: /usr/sbin/glusterfsd -s gluster-2 --volfile-id digitalcorpora.gluster-2.export-brick7-digitalcorpora -p /var/lib/glusterd/vols/digitalcorpora/run/gluster-2-export-brick7-digitalcorpora.pid -S /var/run/gluster/f8e0b3393e47dc51a07c6609f9b40841.socket --brick-name /export/brick7/digitalcorpora -l /var/log/glusterfs/bricks/export-brick7-digitalcorpora.log --xlator-option *-posix.glusterd-uuid=032c17f5-8cc9-445f-aa45-897b5a066b43 --brick-port 49154 --xlator-option digitalcorpora-server.listen-port=49154)
[2017-10-24 17:18:59.285279] I [MSGID: 101190] [event-epoll.c:629:event_dispatch_epoll_worker] 0-epoll: Started thread with index 1
[2017-10-24 17:19:04.611723] I [rpcsvc.c:2237:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2017-10-24 17:19:04.611815] W [MSGID: 101002] [options.c:954:xl_opt_validate] 0-digitalcorpora-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port' , continuing with correction
[2017-10-24 17:19:04.615974] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-server: option 'rpc-auth.auth-glusterfs' is not recognized
[2017-10-24 17:19:04.616033] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-server: option 'rpc-auth.auth-unix' is not recognized
[2017-10-24 17:19:04.616070] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-server: option 'rpc-auth.auth-null' is not recognized
[2017-10-24 17:19:04.616134] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-server: option 'auth-path' is not recognized
[2017-10-24 17:19:04.616177] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-server: option 'ping-timeout' is not recognized
[2017-10-24 17:19:04.616203] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-/export/brick7/digitalcorpora: option 'rpc-auth-allow-insecure' is not recognized
[2017-10-24 17:19:04.616215] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-/export/brick7/digitalcorpora: option 'auth.addr./export/brick7/digitalcorpora.allow' is not recognized
[2017-10-24 17:19:04.616226] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-/export/brick7/digitalcorpora: option 'auth-path' is not recognized
[2017-10-24 17:19:04.616237] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-/export/brick7/digitalcorpora: option 'auth.login.b17f2513-7d9c-4174-a0c5-de4a752d46ca.password' is not recognized
[2017-10-24 17:19:04.616248] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-/export/brick7/digitalcorpora: option 'auth.login./export/brick7/digitalcorpora.allow' is not recognized
[2017-10-24 17:19:04.616283] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-quota: option 'timeout' is not recognized
[2017-10-24 17:19:04.616367] W [MSGID: 101174] [graph.c:361:_log_if_unknown_option] 0-digitalcorpora-trash: option 'brick-path' is not recognized
Final graph:
+------------------------------------------------------------------------------+
1: volume digitalcorpora-posix
2: type storage/posix
3: option glusterd-uuid 032c17f5-8cc9-445f-aa45-897b5a066b43
4: option directory /export/brick7/digitalcorpora
5: option volume-id 61efe58a-ae5b-4d8b-b9f9-67829867c442
6: option brick-uid 36
7: option brick-gid 36
8: end-volume
9:
10: volume digitalcorpora-trash
11: type features/trash
12: option trash-dir .trashcan
13: option brick-path /export/brick7/digitalcorpora
14: option trash-internal-op off
15: subvolumes digitalcorpora-posix
16: end-volume
17:
18: volume digitalcorpora-changetimerecorder
19: type features/changetimerecorder
20: option db-type sqlite3
21: option hot-brick off
22: option db-name digitalcorpora.db
23: option db-path /export/brick7/digitalcorpora/.glusterfs/
24: option record-exit off
25: option ctr_link_consistency off
26: option ctr_lookupheal_link_timeout 300
27: option ctr_lookupheal_inode_timeout 300
28: option record-entry on
29: option ctr-enabled off
30: option record-counters off
31: option ctr-record-metadata-heat off
32: option sql-db-cachesize 12500
33: option sql-db-wal-autocheckpoint 25000
34: subvolumes digitalcorpora-trash
35: end-volume
36:
37: volume digitalcorpora-changelog
38: type features/changelog
39: option changelog-brick /export/brick7/digitalcorpora
40: option changelog-dir /export/brick7/digitalcorpora/.glusterfs/changelogs
41: option changelog-barrier-timeout 120
42: subvolumes digitalcorpora-changetimerecorder
43: end-volume
44:
45: volume digitalcorpora-bitrot-stub
46: type features/bitrot-stub
47: option export /export/brick7/digitalcorpora
48: subvolumes digitalcorpora-changelog
49: end-volume
50:
51: volume digitalcorpora-access-control
52: type features/access-control
53: subvolumes digitalcorpora-bitrot-stub
54: end-volume
55:
56: volume digitalcorpora-locks
57: type features/locks
58: subvolumes digitalcorpora-access-control
59: end-volume
60:
61: volume digitalcorpora-worm
62: type features/worm
63: option worm off
64: option worm-file-level off
65: subvolumes digitalcorpora-locks
66: end-volume
67:
68: volume digitalcorpora-read-only
69: type features/read-only
70: option read-only off
71: subvolumes digitalcorpora-worm
72: end-volume
73:
74: volume digitalcorpora-leases
75: type features/leases
76: option leases off
77: subvolumes digitalcorpora-read-only
78: end-volume
79:
80: volume digitalcorpora-upcall
81: type features/upcall
82: option cache-invalidation off
83: subvolumes digitalcorpora-leases
84: end-volume
85:
86: volume digitalcorpora-io-threads
87: type performance/io-threads
88: subvolumes digitalcorpora-upcall
89: end-volume
90:
91: volume digitalcorpora-marker
92: type features/marker
93: option volume-uuid 61efe58a-ae5b-4d8b-b9f9-67829867c442
94: option timestamp-file /var/lib/glusterd/vols/digitalcorpora/marker.tstamp
95: option quota-version 0
96: option xtime off
97: option gsync-force-xtime off
98: option quota off
99: option inode-quota off
100: subvolumes digitalcorpora-io-threads
101: end-volume
102:
103: volume digitalcorpora-barrier
104: type features/barrier
105: option barrier disable
106: option barrier-timeout 120
107: subvolumes digitalcorpora-marker
108: end-volume
109:
110: volume digitalcorpora-index
111: type features/index
112: option index-base /export/brick7/digitalcorpora/.glusterfs/indices
113: option xattrop-dirty-watchlist trusted.afr.dirty
114: option xattrop-pending-watchlist trusted.afr.digitalcorpora-
115: subvolumes digitalcorpora-barrier
116: end-volume
117:
118: volume digitalcorpora-quota
119: type features/quota
120: option volume-uuid digitalcorpora
121: option server-quota off
122: option timeout 0
123: option deem-statfs off
124: subvolumes digitalcorpora-index
125: end-volume
126:
127: volume digitalcorpora-io-stats
128: type debug/io-stats
129: option unique-id /export/brick7/digitalcorpora
130: option log-level WARNING
131: option latency-measurement off
132: option count-fop-hits off
133: subvolumes digitalcorpora-quota
134: end-volume
135:
136: volume /export/brick7/digitalcorpora
137: type performance/decompounder
138: option rpc-auth-allow-insecure on
139: option auth.addr./export/brick7/digitalcorpora.allow 129.174.125.204,129.174.93.204
140: option auth-path /export/brick7/digitalcorpora
141: option auth.login.b17f2513-7d9c-4174-a0c5-de4a752d46ca.password 6c007ad0-b5a2-4564-8464-300f8317e5c7
142: option auth.login./export/brick7/digitalcorpora.allow b17f2513-7d9c-4174-a0c5-de4a752d46ca
143: subvolumes digitalcorpora-io-stats
144: end-volume
145:
146: volume digitalcorpora-server
147: type protocol/server
148: option transport.socket.listen-port 49154
149: option rpc-auth.auth-glusterfs on
150: option rpc-auth.auth-unix on
151: option rpc-auth.auth-null on
152: option transport-type tcp
153: option transport.address-family inet
154: option auth.login./export/brick7/digitalcorpora.allow b17f2513-7d9c-4174-a0c5-de4a752d46ca
155: option auth.login.b17f2513-7d9c-4174-a0c5-de4a752d46ca.password 6c007ad0-b5a2-4564-8464-300f8317e5c7
156: option auth-path /export/brick7/digitalcorpora
157: option auth.addr./export/brick7/digitalcorpora.allow 129.174.125.204,129.174.93.204
158: option ping-timeout 42
159: option transport.socket.keepalive 1
160: option rpc-auth-allow-insecure on
161: option transport.tcp-user-timeout 0
162: option transport.socket.keepalive-time 20
163: option transport.socket.keepalive-interval 2
164: option transport.socket.keepalive-count 9
165: subvolumes /export/brick7/digitalcorpora
166: end-volume
167:
+------------------------------------------------------------------------------+
[2017-10-24 17:22:21.438620] W [socket.c:593:__socket_rwv] 0-glusterfs: readv on 129.174.126.87:24007 failed (No data available)
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-users