If you are trying this again, please run `gluster volume set $volname
client-log-level DEBUG` before attempting the add-brick and attach
the gvol0-add-brick-mount.log here. After that, you can change the
client-log-level back to INFO.
-Ravi
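A minimal sketch of the sequence requested above, using only the commands already named in this thread. The `GLUSTER` variable is a hypothetical dry-run switch, not part of gluster: by default the function only prints the commands, and setting `GLUSTER=gluster` would run them for real.

```shell
#!/bin/sh
# Sketch only: raise the client log level, attempt the add-brick, then
# restore the log level whether or not the add-brick succeeded.
# GLUSTER is a hypothetical dry-run switch: it defaults to "echo gluster",
# so the commands are printed rather than executed.
GLUSTER=${GLUSTER:-echo gluster}

add_brick_with_debug_log() {
    volname=$1
    brick=$2
    $GLUSTER volume set "$volname" client-log-level DEBUG || return 1
    $GLUSTER volume add-brick "$volname" replica 3 arbiter 1 "$brick"
    status=$?
    # Always go back to INFO so the client logs do not keep growing.
    $GLUSTER volume set "$volname" client-log-level INFO
    return $status
}
```

For the volume in this thread the call would be `add_brick_with_debug_log gvol0 gfs3:/nodirectwritedata/gluster/gvol0`, run from gfs1.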
On 22/05/19 11:32 AM, Ravishankar N wrote:
On 22/05/19 11:23 AM, David Cunningham wrote:
Hi Ravi,
I'd already done exactly that before, where step 3 was a
simple 'rm -rf /nodirectwritedata/gluster/gvol0'. Have you
another suggestion on what the cleanup or reformat should
be?
`rm -rf /nodirectwritedata/gluster/gvol0` does look okay to me
David. Basically, '/nodirectwritedata/gluster/gvol0' must be empty
and must not have any extended attributes set on it. Why
fuse_first_lookup() is failing is a bit of a mystery to me at this
point. :-(
Regards,
Ravi
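The two conditions Ravi names (the directory must be empty and must carry no extended attributes) could be checked with something like the sketch below. The function names are made up for illustration; `getfattr` is the same tool used elsewhere in this thread.

```shell
#!/bin/sh
# Sketch only: check the two conditions named above for a brick directory.

# Succeeds if the directory exists and contains no entries, hidden or not
# (so a leftover .glusterfs folder counts as "not empty").
dir_is_empty() {
    [ -d "$1" ] && [ -z "$(ls -A "$1")" ]
}

# Succeeds if getfattr prints no extended attributes for the directory.
# Returns 2 if getfattr is not installed. Run as root: the trusted.*
# attributes are only visible to privileged users.
dir_has_no_xattrs() {
    command -v getfattr >/dev/null 2>&1 || return 2
    [ -z "$(getfattr -d -m. -e hex "$1" 2>/dev/null)" ]
}

brick_dir_is_clean() {
    dir_is_empty "$1" && dir_has_no_xattrs "$1"
}
```

On gfs3 this would be called as `brick_dir_is_clean /nodirectwritedata/gluster/gvol0` after the cleanup and before the add-brick.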
Hmm, so the volume info seems to indicate that the
add-brick was successful but the gfid xattr is missing
on the new brick (as are the actual files, barring the
.glusterfs folder, according to your previous mail).
Do you want to try removing and adding it again?
1. `gluster volume remove-brick gvol0 replica 2
gfs3:/nodirectwritedata/gluster/gvol0 force` from gfs1
2. Check that gluster volume info is now back to a 1x2
volume on all nodes and `gluster peer status` is
connected on all nodes.
3. Cleanup or reformat
'/nodirectwritedata/gluster/gvol0' on gfs3.
4. `gluster volume add-brick gvol0 replica 3 arbiter 1
gfs3:/nodirectwritedata/gluster/gvol0` from gfs1.
5. Check that the files are getting healed on to the
new brick.
Thanks,
Ravi
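The five steps above can be laid out as a command list. The sketch below only prints the commands so they can be reviewed first. The `rm -rf` line is the cleanup David describes elsewhere in the thread, and `gluster volume heal <volname> info` for step 5 is an assumption on my part, since the thread does not say which command to use for that check. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: print the five steps as commands instead of running them.
# Steps 1 and 4 are meant to be run from gfs1 and step 3 on the new node.
# Step 3 is destructive, and the heal check in step 5 is an assumption.
print_readd_arbiter_plan() {
    volname=$1      # e.g. gvol0
    host=$2         # e.g. gfs3
    brick_path=$3   # e.g. /nodirectwritedata/gluster/gvol0
    cat <<EOF
gluster volume remove-brick $volname replica 2 $host:$brick_path force
gluster volume info $volname
gluster peer status
rm -rf $brick_path
gluster volume add-brick $volname replica 3 arbiter 1 $host:$brick_path
gluster volume heal $volname info
EOF
}
```

Calling `print_readd_arbiter_plan gvol0 gfs3 /nodirectwritedata/gluster/gvol0` prints one line per command, with the two checks of step 2 on separate lines.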
On 22/05/19 6:50 AM, David Cunningham wrote:
Hi Ravi,
Certainly. On the existing two nodes:
gfs1 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
getfattr: Removing leading '/' from absolute path names
# file: nodirectwritedata/gluster/gvol0
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gvol0-client-2=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
gfs2 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
getfattr: Removing leading '/' from absolute path names
# file: nodirectwritedata/gluster/gvol0
trusted.afr.dirty=0x000000000000000000000000
trusted.afr.gvol0-client-0=0x000000000000000000000000
trusted.afr.gvol0-client-2=0x000000000000000000000000
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
On the new node:
gfs3 # getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0
getfattr: Removing leading '/' from absolute path names
# file: nodirectwritedata/gluster/gvol0
trusted.afr.dirty=0x000000000000000000000001
trusted.glusterfs.volume-id=0xfb5af69e1c3e41648b23c1d7bec9b1b6
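The difference between the three dumps can be spotted mechanically. The sketch below reads `getfattr` output on stdin and reports which of two attributes are absent. The choice of `trusted.gfid` and `trusted.glusterfs.volume-id` is an assumption drawn from what the two existing bricks show in this thread, not an official list of required attributes, and the function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: read `getfattr -d -m. -e hex <brick>` output on stdin and
# print each expected attribute name that does not appear in it.
# The list of expected names is an assumption based on this thread.
missing_brick_xattrs() {
    dump=$(cat)
    for name in trusted.gfid trusted.glusterfs.volume-id; do
        # Match on the exact text before the first "=".
        printf '%s\n' "$dump" |
            awk -F= -v n="$name" '$1 == n { found = 1 } END { exit !found }' ||
            printf '%s\n' "$name"
    done
}
```

A possible use on a brick is `getfattr -d -m. -e hex /nodirectwritedata/gluster/gvol0 2>/dev/null | missing_brick_xattrs`.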
Output of "gluster volume info" is the same on all 3 nodes and is:
# gluster volume info
Volume Name: gvol0
Type: Replicate
Volume ID: fb5af69e-1c3e-4164-8b23-c1d7bec9b1b6
Status: Started
Snapshot Count: 0
Number of Bricks: 1 x (2 + 1) = 3
Transport-type: tcp
Bricks:
Brick1: gfs1:/nodirectwritedata/gluster/gvol0
Brick2: gfs2:/nodirectwritedata/gluster/gvol0
Brick3: gfs3:/nodirectwritedata/gluster/gvol0 (arbiter)
Options Reconfigured:
performance.client-io-threads: off
nfs.disable: on
transport.address-family: inet
Hi David,
Could you provide the `getfattr -d -m. -e hex
/nodirectwritedata/gluster/gvol0` output of all
bricks and the output of `gluster volume info`?
Thanks,
Ravi
On 22/05/19 4:57 AM, David Cunningham wrote:
Hi Sanju,
Here's what glusterd.log says on the new arbiter server when trying
to add the node:
[2019-05-22 00:15:05.963059] I [run.c:242:runner_log] (-->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0x3b2cd) [0x7fe4ca9102cd] -->/usr/lib64/glusterfs/5.6/xlator/mgmt/glusterd.so(+0xe6b85) [0x7fe4ca9bbb85] -->/lib64/libglusterfs.so.0(runner_log+0x115) [0x7fe4d5ecc955] ) 0-management: Ran script: /var/lib/glusterd/hooks/1/add-brick/pre/S28Quota-enable-root-xattr-heal.sh --volname=gvol0 --version=1 --volume-op=add-brick --gd-workdir=/var/lib/glusterd
[2019-05-22 00:15:05.963177] I [MSGID: 106578] [glusterd-brick-ops.c:1355:glusterd_op_perform_add_bricks] 0-management: replica-count is set 3
[2019-05-22 00:15:05.963228] I [MSGID: 106578] [glusterd-brick-ops.c:1360:glusterd_op_perform_add_bricks] 0-management: arbiter-count is set 1
[2019-05-22 00:15:05.963257] I [MSGID: 106578] [glusterd-brick-ops.c:1364:glusterd_op_perform_add_bricks] 0-management: type is set 0, need to change it
[2019-05-22 00:15:17.015268] E [MSGID: 106053] [glusterd-utils.c:13942:glusterd_handle_replicate_brick_ops] 0-management: Failed to set extended attribute trusted.add-brick : Transport endpoint is not connected [Transport endpoint is not connected]
[2019-05-22 00:15:17.036479] E [MSGID: 106073] [glusterd-brick-ops.c:2595:glusterd_op_add_brick] 0-glusterd: Unable to add bricks
[2019-05-22 00:15:17.036595] E [MSGID: 106122] [glusterd-mgmt.c:299:gd_mgmt_v3_commit_fn] 0-management: Add-brick commit failed.
[2019-05-22 00:15:17.036710] E [MSGID: 106122] [glusterd-mgmt-handler.c:594:glusterd_handle_commit_fn] 0-management: commit failed on operation Add brick
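To pull only the error-level entries out of a log like this one, a filter along these lines may help. It is a sketch that assumes each entry sits on a single line in the layout `[date time] LEVEL [...]`, as in the raw log file rather than a wrapped email copy. The function name is hypothetical.

```shell
#!/bin/sh
# Sketch only: keep the entries whose severity letter is E (error).
# Assumes one entry per line, laid out as "[date time] LEVEL [...] ...".
gluster_log_errors() {
    grep -E '^\[[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9:.]+\] E '
}
```

Assuming glusterd.log lives in /var/log/glusterfs like the glustershd.log shown in this thread, it could be used as `gluster_log_errors < /var/log/glusterfs/glusterd.log`.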
As before gvol0-add-brick-mount.log said:
[2019-05-22 00:15:17.005695] I [fuse-bridge.c:4267:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2019-05-22 00:15:17.005749] I [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse: switched to graph 0
[2019-05-22 00:15:17.010101] E [fuse-bridge.c:4336:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected)
[2019-05-22 00:15:17.014217] W [fuse-bridge.c:897:fuse_attr_cbk] 0-glusterfs-fuse: 2: LOOKUP() / => -1 (Transport endpoint is not connected)
[2019-05-22 00:15:17.015097] W [fuse-resolve.c:127:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected)
[2019-05-22 00:15:17.015158] W [fuse-bridge.c:3294:fuse_setxattr_resume] 0-glusterfs-fuse: 3: SETXATTR 00000000-0000-0000-0000-000000000001/1 (trusted.add-brick) resolution failed
[2019-05-22 00:15:17.035636] I [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse: initating unmount of /tmp/mntYGNbj9
[2019-05-22 00:15:17.035854] W [glusterfsd.c:1500:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dd5) [0x7f7745ccedd5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x55c81b63de75] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x55c81b63dceb] ) 0-: received signum (15), shutting down
[2019-05-22 00:15:17.035942] I [fuse-bridge.c:5914:fini] 0-fuse: Unmounting '/tmp/mntYGNbj9'.
[2019-05-22 00:15:17.035966] I [fuse-bridge.c:5919:fini] 0-fuse: Closing fuse connection to '/tmp/mntYGNbj9'.
Here are the processes running on the new arbiter server:
# ps -ef | grep gluster
root 3466 1 0 20:13 ? 00:00:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/24c12b09f93eec8e.socket --xlator-option *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412 --process-name glustershd
root 6832 1 0 May16 ? 00:02:10 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
root 17841 1 0 May16 ? 00:00:58 /usr/sbin/glusterfs --process-name fuse --volfile-server=gfs1 --volfile-id=/gvol0 /mnt/glusterfs
Here are the files created on the new arbiter server:
# find /nodirectwritedata/gluster/gvol0 | xargs ls -ald
drwxr-xr-x 3 root root 4096 May 21 20:15 /nodirectwritedata/gluster/gvol0
drw------- 2 root root 4096 May 21 20:15 /nodirectwritedata/gluster/gvol0/.glusterfs
Thank you for your help!
David,
can you please attach the glusterd logs? Since the error message
says the commit failed on the arbiter node, we might be able to
find the issue on that node.
Hello,
We're adding an arbiter node to an existing volume and having an
issue. Can anyone help? The root cause error appears to be
"00000000-0000-0000-0000-000000000001: failed to resolve (Transport
endpoint is not connected)", as below.
We are running glusterfs 5.6.1. Thanks in advance for any
assistance!
On existing node gfs1, trying to add new arbiter node gfs3:
# gluster volume add-brick gvol0 replica 3 arbiter 1 gfs3:/nodirectwritedata/gluster/gvol0
volume add-brick: failed: Commit failed on gfs3. Please check log file for details.
This looks like a glusterd
issue. Please check the glusterd
logs for more info.
Adding the glusterd dev to this
thread. Sanju, can you take a
look?
Regards,
Nithya
On new node gfs3 in gvol0-add-brick-mount.log:
[2019-05-17 01:20:22.689721] I [fuse-bridge.c:4267:fuse_init] 0-glusterfs-fuse: FUSE inited with protocol versions: glusterfs 7.24 kernel 7.22
[2019-05-17 01:20:22.689778] I [fuse-bridge.c:4878:fuse_graph_sync] 0-fuse: switched to graph 0
[2019-05-17 01:20:22.694897] E [fuse-bridge.c:4336:fuse_first_lookup] 0-fuse: first lookup on root failed (Transport endpoint is not connected)
[2019-05-17 01:20:22.699770] W [fuse-resolve.c:127:fuse_resolve_gfid_cbk] 0-fuse: 00000000-0000-0000-0000-000000000001: failed to resolve (Transport endpoint is not connected)
[2019-05-17 01:20:22.699834] W [fuse-bridge.c:3294:fuse_setxattr_resume] 0-glusterfs-fuse: 2: SETXATTR 00000000-0000-0000-0000-000000000001/1 (trusted.add-brick) resolution failed
[2019-05-17 01:20:22.715656] I [fuse-bridge.c:5144:fuse_thread_proc] 0-fuse: initating unmount of /tmp/mntQAtu3f
[2019-05-17 01:20:22.715865] W [glusterfsd.c:1500:cleanup_and_exit] (-->/lib64/libpthread.so.0(+0x7dd5) [0x7fb223bf6dd5] -->/usr/sbin/glusterfs(glusterfs_sigwaiter+0xe5) [0x560886581e75] -->/usr/sbin/glusterfs(cleanup_and_exit+0x6b) [0x560886581ceb] ) 0-: received signum (15), shutting down
[2019-05-17 01:20:22.715926] I [fuse-bridge.c:5914:fini] 0-fuse: Unmounting '/tmp/mntQAtu3f'.
[2019-05-17 01:20:22.715953] I [fuse-bridge.c:5919:fini] 0-fuse: Closing fuse connection to '/tmp/mntQAtu3f'.
Processes running on new node gfs3:
# ps -ef | grep gluster
root 6832 1 0 20:17 ? 00:00:00 /usr/sbin/glusterd -p /var/run/glusterd.pid --log-level INFO
root 15799 1 0 20:17 ? 00:00:00 /usr/sbin/glusterfs -s localhost --volfile-id gluster/glustershd -p /var/run/gluster/glustershd/glustershd.pid -l /var/log/glusterfs/glustershd.log -S /var/run/gluster/24c12b09f93eec8e.socket --xlator-option *replicate*.node-uuid=2069cfb3-c798-47e3-8cf8-3c584cf7c412 --process-name glustershd
root 16856 16735 0 21:21 pts/0 00:00:00 grep --color=auto gluster
--
David Cunningham, Voisonics Limited
http://voisonics.com/
USA: +1 213 221 1092
New Zealand: +64 (0)28 2558 3782
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
https://lists.gluster.org/mailman/listinfo/gluster-users