Try to restart glusterd on all the storage nodes (a rough sketch of what that could look like is at the bottom of this mail).

-Bishoy

> On Oct 30, 2015, at 4:26 AM, Thomas Bätzler <t.baetzler@xxxxxxxxxx> wrote:
>
> Hi,
>
> can somebody help me with fixing our 8-node gluster please?
>
> The setup is as follows:
>
> root@glucfshead2:~# gluster volume info
>
> Volume Name: archive
> Type: Distributed-Replicate
> Volume ID: d888b302-2a35-4559-9bb0-4e182f49f9c6
> Status: Started
> Number of Bricks: 4 x 2 = 8
> Transport-type: tcp
> Bricks:
> Brick1: glucfshead1:/data/glusterfs/archive/brick1
> Brick2: glucfshead5:/data/glusterfs/archive/brick1
> Brick3: glucfshead2:/data/glusterfs/archive/brick1
> Brick4: glucfshead6:/data/glusterfs/archive/brick1
> Brick5: glucfshead3:/data/glusterfs/archive/brick1
> Brick6: glucfshead7:/data/glusterfs/archive/brick1
> Brick7: glucfshead4:/data/glusterfs/archive/brick1
> Brick8: glucfshead8:/data/glusterfs/archive/brick1
> Options Reconfigured:
> cluster.data-self-heal: off
> cluster.entry-self-heal: off
> cluster.metadata-self-heal: off
> features.lock-heal: on
> cluster.readdir-optimize: on
> performance.flush-behind: off
> performance.io-thread-count: 16
> features.quota: off
> performance.quick-read: on
> performance.stat-prefetch: off
> performance.io-cache: on
> performance.cache-refresh-timeout: 1
> nfs.disable: on
> performance.cache-max-file-size: 200kb
> performance.cache-size: 2GB
> performance.write-behind-window-size: 4MB
> performance.read-ahead: off
> storage.linux-aio: off
> diagnostics.brick-sys-log-level: WARNING
> cluster.self-heal-daemon: off
>
> Volume Name: archive2
> Type: Distributed-Replicate
> Volume ID: 0fe86e42-e67f-46d8-8ed0-d0e34f539d69
> Status: Started
> Number of Bricks: 4 x 2 = 8
> Transport-type: tcp
> Bricks:
> Brick1: glucfshead1:/data/glusterfs/archive2/brick1
> Brick2: glucfshead5:/data/glusterfs/archive2/brick1
> Brick3: glucfshead2:/data/glusterfs/archive2/brick1
> Brick4: glucfshead6:/data/glusterfs/archive2/brick1
> Brick5: glucfshead3:/data/glusterfs/archive2/brick1
> Brick6: glucfshead7:/data/glusterfs/archive2/brick1
> Brick7: glucfshead4:/data/glusterfs/archive2/brick1
> Brick8: glucfshead8:/data/glusterfs/archive2/brick1
> Options Reconfigured:
> cluster.metadata-self-heal: off
> cluster.entry-self-heal: off
> cluster.data-self-heal: off
> diagnostics.count-fop-hits: on
> diagnostics.latency-measurement: on
> features.lock-heal: on
> diagnostics.brick-sys-log-level: WARNING
> storage.linux-aio: off
> performance.read-ahead: off
> performance.write-behind-window-size: 4MB
> performance.cache-size: 2GB
> performance.cache-max-file-size: 200kb
> nfs.disable: on
> performance.cache-refresh-timeout: 1
> performance.io-cache: on
> performance.stat-prefetch: off
> performance.quick-read: on
> features.quota: off
> performance.io-thread-count: 16
> performance.flush-behind: off
> auth.allow: 172.16.15.*
> cluster.readdir-optimize: on
> cluster.self-heal-daemon: off
>
> Some time ago, node glucfshead1 broke down. After some fiddling it was
> decided not to deal with it immediately, because the gluster was in
> production and a rebuild on 3.4 would basically have rendered it unusable.
>
> Recently it was felt that we needed to deal with the situation, and we
> hired some experts to help with the problem. So we reinstalled the
> broken node, gave it a new name/IP, and upgraded all systems to 3.6.4.
>
> The plan was to probe the "new" node into the gluster and then do a
> brick-replace on it. However, that did not go as expected.
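Presumably that brick-replace plan looked something like the following. This is only a sketch, assuming the replacement node is the glucfshead9 you mention below and that it reuses the same brick paths as the dead glucfshead1:

  # from any healthy node: add the replacement node to the trusted pool
  gluster peer probe glucfshead9

  # then, per volume, swap the dead node's brick for the new one
  # (3.6.x syntax; the new brick starts out empty)
  gluster volume replace-brick archive \
      glucfshead1:/data/glusterfs/archive/brick1 \
      glucfshead9:/data/glusterfs/archive/brick1 commit force

  gluster volume replace-brick archive2 \
      glucfshead1:/data/glusterfs/archive2/brick1 \
      glucfshead9:/data/glusterfs/archive2/brick1 commit force

Note that cluster.self-heal-daemon is off on both volumes, so the new bricks would not be repopulated automatically until self-heal is re-enabled or a heal is triggered by hand.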
>
> The node that we removed is now listed as "Peer Rejected":
>
> root@glucfshead2:~# gluster peer status
> Number of Peers: 7
>
> Hostname: glucfshead1
> Uuid: 09ed9a29-c923-4dc5-957a-e0d3e8032daf
> State: Peer Rejected (Disconnected)
>
> Hostname: glucfshead3
> Uuid: a17ae95d-4598-4cd7-9ae7-808af10fedb5
> State: Peer in Cluster (Connected)
>
> Hostname: glucfshead4
> Uuid: 8547dadd-96bf-45fe-b49d-bab8f995c928
> State: Peer in Cluster (Connected)
>
> Hostname: glucfshead5
> Uuid: 249da8ea-fda6-47ff-98e0-dbff99dcb3f2
> State: Peer in Cluster (Connected)
>
> Hostname: glucfshead6
> Uuid: a0229511-978c-4904-87ae-7e1b32ac2c72
> State: Peer in Cluster (Connected)
>
> Hostname: glucfshead7
> Uuid: 548ec75a-0131-4c92-aaa9-7c6ee7b47a63
> State: Peer in Cluster (Connected)
>
> Hostname: glucfshead8
> Uuid: 5e54cbc1-482c-460b-ac38-00c4b71c50b9
> State: Peer in Cluster (Connected)
>
> If I probe the replacement node (glucfshead9), it only ever shows up on
> one of my running nodes, and there it is in the state "Peer Rejected
> (Connected)".
>
> How can we fix this - preferably without losing data?
>
> TIA,
> Thomas
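To expand on the suggestion at the top: restarting glusterd only restarts the management daemon and should not normally touch the running brick processes or your data, so it is a low-risk first step. Roughly, on each storage node in turn (the exact service name depends on your distribution and packages, e.g. glusterd or glusterfs-server):

  # restart the management daemon only
  service glusterd restart      # or: systemctl restart glusterd

  # afterwards, check whether the rejected peer has recovered
  gluster peer status

If the peers still show up as rejected after that, the glusterd logs on the rejected node are the next place to look.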