Hmm, so the volume info seems to indicate that the add-brick was successful, but the gfid xattr is missing on the new brick (as are the actual files, barring the .glusterfs folder, according to your previous mail).
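For reference, a quick way to check just that attribute on any brick is something along these lines (only a sketch; the brick path is the one from your volume info, and on a good brick the root gfid should come back as 0x00000000000000000000000000000001):

# getfattr -n trusted.gfid -e hex /nodirectwritedata/gluster/gvol0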
Do you want to try removing and adding it again?
1. `gluster volume remove-brick gvol0 replica 2 gfs3:/nodirectwritedata/gluster/gvol0 force` from gfs1.
2. Check that `gluster volume info` is now back to a 1x2 volume on all nodes and that `gluster peer status` shows all peers connected on all nodes.
3. Clean up or reformat '/nodirectwritedata/gluster/gvol0' on gfs3.
4. `gluster volume add-brick gvol0 replica 3 arbiter 1 gfs3:/nodirectwritedata/gluster/gvol0` from gfs1.
5. Check that the files are getting healed on to the new brick (see the sketch below).
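For steps 3 and 5, something along these lines should work (just a sketch; for the clean-up, remove whichever trusted.* xattrs getfattr currently shows on that brick root, then the .glusterfs directory):

gfs3 # setfattr -x trusted.glusterfs.volume-id /nodirectwritedata/gluster/gvol0
gfs3 # setfattr -x trusted.afr.dirty /nodirectwritedata/gluster/gvol0
gfs3 # rm -rf /nodirectwritedata/gluster/gvol0/.glusterfs

and after the add-brick, from any node:

# gluster volume heal gvol0 info
# gluster volume heal gvol0 info summary

(the "info summary" form may not be available depending on your gluster version)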
Thanks,
Ravi
On 22/05/19 6:50 AM, David Cunningham wrote:
Hi Ravi,
Certainly. On the existing two nodes:
gfs1 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
getfattr: Removing leading '/' from absolute path names
# file: nodirectwritedata/gluster/gvol0
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gvol0-client-2=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
gfs2 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
getfattr: Removing leading '/' from absolute path names
# file: nodirectwritedata/gluster/gvol0
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gvol0-client-0=0x000000000000000000000000
trusted.afr.gvol0-client-2=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
On the new node:
gfs3 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
getfattr: Removing leading '/' from absolute path names
# file: nodirectwritedata/gluster/gvol0
trusted.afr.dirty=0x000000000000000000000001
trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
Output of "gluster volume info" is the same on
all 3 nodes and is:
# gluster volume info
Volume Name: gvol0
Type: Replicate
Volume ID: fb5af69e-1c3e-4164-8b23-c1d7bec9b1b6
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gfs1:/nodirectwritedata/gluster/gvol0
Brick2: gfs2:/nodirectwritedata/gluster/gvol0
Brick3: gfs3:/nodirectwritedata/gluster/gvol0 (arbiter)
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
Hi David,
Could you provide the `getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0` output of all bricks and the output of `gluster volume info`?
Thanks,
Ravi
On 22/05/19 4:57 AM, David Cunningham wrote:
Hi Sanju,
Here's what glusterd.log says on the new arbiter server when trying to add the node:
[2019-05-22 00:15:05.963059] I [run.c:242:runner_log] (-->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0x3b2cd) [0x7fe4ca9102cd] -->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0xe6b85) [0x7fe4ca9bbb85] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fe4d5ecc955] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=gvol0 --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd
[2019-05-22 00:15:05.963177] I [MSGID: 106578] [glusterd-brick-ops.c:1355:glusterd_op_perform_add_bricks] 0-management: replica-count is set 3
[2019-05-22 00:15:05.963228] I [MSGID: 106578] [glusterd-brick-ops.c:1360:glusterd_op_perform_add_bricks] 0-management: arbiter-count is set 1
[2019-05-22 00:15:05.963257] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it
[2019-05-22 00:15:17.015268] E [MSGID: 106053] [glusterd-utils.c:13942:glusterd_handle_replicate_brick_ops] 0-management: Failed to set extended attribute trusted.add-brick : Transport endpoint is not connected [Transport endpoint is not connected]
[2019-05-22 00:15:17.036479] E [MSGID: 106073] [glusterd-brick-ops.c:2595:glusterd_op_add_brick] 0-glusterd: Unable to add bricks
[2019-05-22 00:15:17.036595] E [MSGID: 106122] [glusterd-mgmt.c:299:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit failed.
[2019-05-22 00:15:17.036710] E [MSGID: 106122] [glusterd-mgmt-handler.c:594:glusterd_handle_commit_fn] 0-management: commit failed on operation Add brick
As before, gvol0-add-brick-mount.log said:
[2019-05-22 00:15:17.005695] I [fuse-bridge.c:4267:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2019-05-22 00:15:17.005749] I [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse: switched to graph 0
[2019-05-22 00:15:17.010101] E [fuse-bridge.c:4336:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected)
[2019-05-22 00:15:17.014217] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (Transport endpoint is not connected)
[2019-05-22 00:15:17.015097] W [fuse-resolve.c:127:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected)
[2019-05-22 00:15:17.015158] W [fuse-bridge.c:3294:fuse_setxattr_resume] 0-glusterfs-fuse: 3: SETXATTR 00000000-0000-0000-0000-000000000001/1 (trusted.add-brick) resolution failed
[2019-05-22 00:15:17.035636] I [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse: initating unmount of /tmp/mntYGNbj9
[2019-05-22 00:15:17.035854] W [glusterfsd.c:1500:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dd5) [0x7f7745ccedd5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55c81b63de75] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55c81b63dceb] ) 0-: received signum (15), shutting down
[2019-05-22 00:15:17.035942] I [fuse-bridge.c:5914:fini] 0-fuse: Unmounting '/tmp/mntYGNbj9'.
[2019-05-22 00:15:17.035966] I [fuse-bridge.c:5919:fini] 0-fuse: Closing fuse connection to '/tmp/mntYGNbj9'.
Here are the processes running on the new arbiter server:
# ps -ef | grep gluster
root      3466     1  0 20:13 ?        00:00:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/24c12b09f93eec8e.socket --xlator-option *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412 --process-name glustershd
root      6832     1  0 May16 ?        00:02:10 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
root     17841     1  0 May16 ?        00:00:58 /usr/sbin/glusterfs --process-name fuse --volfile-server=gfs1 --volfile-id=/gvol0 /mnt/glusterfs
Here are the files created on the new arbiter server:
# find /nodirectwritedata/gluster/gvol0 | xargs ls -ald
drwxr-xr-x 3 root root 4096 May 21 20:15 /nodirectwritedata/gluster/gvol0
drw------- 2 root root 4096 May 21 20:15 /nodirectwritedata/gluster/gvol0/.glusterfs
Thank you for your help!
David,
Can you please attach glusterd.log? As the error message says, the commit failed on the arbiter node, so we might be able to find some issue on that node.
Hello,
We're adding an arbiter node to an existing volume and having an issue. Can anyone help? The root cause error appears to be "00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected)", as below.
We are running glusterfs 5.6.1. Thanks in advance for any assistance!
On existing node gfs1, trying to add new arbiter node gfs3:
# gluster volume add-brick gvol0 replica 3 arbiter 1 gfs3:/nodirectwritedata/gluster/gvol0
volume add-brick: failed: Commit failed on gfs3. Please check log file for details.
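For what it's worth, basic peer connectivity from the node running the command can be sanity-checked with something like the following (just a sketch using the volume name above):

# gluster peer status
# gluster volume status gvol0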
This looks like a glusterd issue. Please check the glusterd logs for more info. Adding the glusterd dev to this thread. Sanju, can you take a look?
Regards,
Nithya
On new node gfs3 in gvol0-add-brick-mount.log:
[2019-05-17 01:20:22.689721] I [fuse-bridge.c:4267:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2019-05-17 01:20:22.689778] I [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse: switched to graph 0
[2019-05-17 01:20:22.694897] E [fuse-bridge.c:4336:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected)
[2019-05-17 01:20:22.699770] W [fuse-resolve.c:127:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected)
[2019-05-17 01:20:22.699834] W [fuse-bridge.c:3294:fuse_setxattr_resume] 0-glusterfs-fuse: 2: SETXATTR 00000000-0000-0000-0000-000000000001/1 (trusted.add-brick) resolution failed
[2019-05-17 01:20:22.715656] I [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse: initating unmount of /tmp/mntQAtu3f
[2019-05-17 01:20:22.715865] W [glusterfsd.c:1500:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dd5) [0x7fb223bf6dd5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x560886581e75] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x560886581ceb] ) 0-: received signum (15), shutting down
[2019-05-17 01:20:22.715926] I [fuse-bridge.c:5914:fini] 0-fuse: Unmounting '/tmp/mntQAtu3f'.
[2019-05-17 01:20:22.715953] I [fuse-bridge.c:5919:fini] 0-fuse: Closing fuse connection to '/tmp/mntQAtu3f'.
Processes running on new node gfs3:
# ps -ef | grep gluster
root      6832     1  0 20:17 ?        00:00:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
root     15799     1  0 20:17 ?        00:00:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/24c12b09f93eec8e.socket --xlator-option *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412 --process-name glustershd
root     16856 16735  0 21:21 pts/0    00:00:00 grep --color=auto gluster
--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users