Additional info: there are warnings/errors in the new brick's log:
[2016-12-08 15:37:05.053615] E [MSGID: 115056]
[server-rpc-fops.c:509:server_mkdir_cbk] 0-storage-server: 12636867:
MKDIR /dms (00000000-0000-0000-0000-000000000001/dms) ==> (Permission
denied) [Permission denied]
[2016-12-08 15:37:05.135607] I [MSGID: 115081]
[server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 12636895:
FSTAT -2 (e9481d78-9094-45a7-ac7e-e1feeb7055df) ==> (No such file or
directory) [No such file or directory]
[2016-12-08 15:37:05.163610] I [MSGID: 115081]
[server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523605:
FSTAT -2 (2bb87992-5f24-44bd-ba7c-70c84510942b) ==> (No such file or
directory) [No such file or directory]
[2016-12-08 15:37:05.163633] I [MSGID: 115081]
[server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523604:
FSTAT -2 (2bb87992-5f24-44bd-ba7c-70c84510942b) ==> (No such file or
directory) [No such file or directory]
[2016-12-08 15:37:05.166590] I [MSGID: 115081]
[server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523619:
FSTAT -2 (616028b7-a2c2-40e3-998a-68329daf7b07) ==> (No such file or
directory) [No such file or directory]
[2016-12-08 15:37:05.166659] I [MSGID: 115081]
[server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523620:
FSTAT -2 (616028b7-a2c2-40e3-998a-68329daf7b07) ==> (No such file or
directory) [No such file or directory]
[2016-12-08 15:37:05.241276] I [MSGID: 115081]
[server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3451382:
FSTAT -2 (f00e597e-7ae4-4d3a-986e-bbeb6cc07339) ==> (No such file or
directory) [No such file or directory]
[2016-12-08 15:37:05.268583] I [MSGID: 115081]
[server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523823:
FSTAT -2 (a8a343c1-512f-4ad1-a3db-de9fc8ed990c) ==> (No such file or
directory) [No such file or directory]
[2016-12-08 15:37:05.268771] I [MSGID: 115081]
[server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523824:
FSTAT -2 (a8a343c1-512f-4ad1-a3db-de9fc8ed990c) ==> (No such file or
directory) [No such file or directory]
[2016-12-08 15:37:05.302501] I [MSGID: 115081]
[server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523868:
FSTAT -2 (eb0c4500-f9ae-408a-85e6-6e67ec6466a9) ==> (No such file or
directory) [No such file or directory]
[2016-12-08 15:37:05.302558] I [MSGID: 115081]
[server-rpc-fops.c:1280:server_fstat_cbk] 0-storage-server: 3523869:
FSTAT -2 (eb0c4500-f9ae-408a-85e6-6e67ec6466a9) ==> (No such file or
directory) [No such file or directory]
[2016-12-08 15:37:05.365428] E [MSGID: 115056]
[server-rpc-fops.c:509:server_mkdir_cbk] 0-storage-server: 12637038:
MKDIR /files (00000000-0000-0000-0000-000000000001/files) ==>
(Permission denied) [Permission denied]
[2016-12-08 15:37:05.414486] E [MSGID: 115056]
[server-rpc-fops.c:509:server_mkdir_cbk] 0-storage-server: 3451430:
MKDIR /files (00000000-0000-0000-0000-000000000001/files) ==>
(Permission denied) [Permission denied]
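A quick thing to compare for those MKDIR "Permission denied" errors (not something asked in the thread, just a generic check): the ownership and mode of the brick root on both servers, e.g.

  # run on both storage2 (old brick) and storage (new brick)
  ls -ld /data/data-cluster
  stat -c '%U:%G %a %n' /data/data-cluster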
- Kindest regards,
Milos Cuculovic
IT Manager
---
MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculovic@xxxxxxxx
Skype: milos.cuculovic.mdpi
On 08.12.2016 16:32, Miloš Čučulović - MDPI wrote:
1. No, at the moment the old server's (storage2) volume is mounted on some other servers, so all files are created there. If I check the new brick, there are no files.
2. On the storage2 server (old brick):
getfattr: Removing leading '/' from absolute path names
# file: data/data-cluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x0226135726f346bcb3f8cb73365ed382
On the storage server (new brick):
getfattr: Removing leading '/' from absolute path names
# file: data/data-cluster
trusted.gfid=0x00000000000000000000000000000001
trusted.glusterfs.dht=0x000000010000000000000000ffffffff
trusted.glusterfs.volume-id=0x0226135726f346bcb3f8cb73365ed382
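For reference, the xattrs above were gathered with the command Ravi asks for below, run against the brick root on each server:

  # dump all xattrs of the brick root in hex
  getfattr -d -m . -e hex /data/data-cluster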
3.
Thread 8 (Thread 0x7fad832dd700 (LWP 30057)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x00007fad88834f3e in __afr_shd_healer_wait () from
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#2 0x00007fad88834fad in afr_shd_healer_wait () from
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#3 0x00007fad88835aa0 in afr_shd_index_healer () from
/usr/lib/x86_64-linux-gnu/glusterfs/3.7.6/xlator/cluster/replicate.so
#4 0x00007fad8df4270a in start_thread (arg=0x7fad832dd700) at
pthread_create.c:333
#5 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 7 (Thread 0x7fad83ade700 (LWP 30056)):
#0 0x00007fad8dc78e23 in epoll_wait () at
../sysdeps/unix/syscall-template.S:84
#1 0x00007fad8e808a58 in ?? () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x00007fad8df4270a in start_thread (arg=0x7fad83ade700) at
pthread_create.c:333
#3 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 6 (Thread 0x7fad894a5700 (LWP 30055)):
#0 0x00007fad8dc78e23 in epoll_wait () at
../sysdeps/unix/syscall-template.S:84
#1 0x00007fad8e808a58 in ?? () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x00007fad8df4270a in start_thread (arg=0x7fad894a5700) at
pthread_create.c:333
#3 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 5 (Thread 0x7fad8a342700 (LWP 30054)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x00007fad8e7ecd98 in syncenv_task () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x00007fad8e7ed970 in syncenv_processor () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#3 0x00007fad8df4270a in start_thread (arg=0x7fad8a342700) at
pthread_create.c:333
#4 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 4 (Thread 0x7fad8ab43700 (LWP 30053)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at
../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x00007fad8e7ecd98 in syncenv_task () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x00007fad8e7ed970 in syncenv_processor () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#3 0x00007fad8df4270a in start_thread (arg=0x7fad8ab43700) at
pthread_create.c:333
#4 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 3 (Thread 0x7fad8b344700 (LWP 30052)):
#0 do_sigwait (sig=0x7fad8b343e3c, set=<optimized out>) at
../sysdeps/unix/sysv/linux/sigwait.c:64
#1 __sigwait (set=<optimized out>, sig=0x7fad8b343e3c) at
../sysdeps/unix/sysv/linux/sigwait.c:96
#2 0x00000000004080bf in glusterfs_sigwaiter ()
#3 0x00007fad8df4270a in start_thread (arg=0x7fad8b344700) at
pthread_create.c:333
#4 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 2 (Thread 0x7fad8bb45700 (LWP 30051)):
#0 0x00007fad8df4bc6d in nanosleep () at
../sysdeps/unix/syscall-template.S:84
#1 0x00007fad8e7ca744 in gf_timer_proc () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x00007fad8df4270a in start_thread (arg=0x7fad8bb45700) at
pthread_create.c:333
#3 0x00007fad8dc7882d in clone () at
../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Thread 1 (Thread 0x7fad8ec66780 (LWP 30050)):
#0 0x00007fad8df439dd in pthread_join (threadid=140383309420288,
thread_return=0x0) at pthread_join.c:90
#1 0x00007fad8e808eeb in ?? () from
/usr/lib/x86_64-linux-gnu/libglusterfs.so.0
#2 0x0000000000405501 in main ()
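The backtrace above was captured following the steps Ravi describes below; non-interactively that is roughly (30050 being the self-heal daemon pid reported by gluster volume status):

  # attach to the self-heal daemon, dump all thread backtraces, then detach
  gdb -p 30050 -batch -ex 'thread apply all bt'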
- Kindest regards,
Milos Cuculovic
IT Manager
---
MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculovic@xxxxxxxx
Skype: milos.cuculovic.mdpi
On 08.12.2016 16:17, Ravishankar N wrote:
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI <cuculovic@xxxxxxxx> wrote:
Ah, damn! I found the issue. On the storage server, the storage2 IP address was wrong; I had swapped two digits in the /etc/hosts file, sorry for that :(
I was able to add the brick now and started the heal, but there is still no data transfer visible.
1. Are the files getting created on the new brick though?
2. Can you provide the output of `getfattr -d -m . -e hex
/data/data-cluster` on both bricks?
3. Is it possible to attach gdb to the self-heal daemon on the original
(old) brick and get a backtrace?
`gdb -p <pid of self-heal daemon on the original brick>`
thread apply all bt --> share this output
quit gdb.
-Ravi
@Ravi/Pranith - can you help here?
By doing gluster volume status, I have:
Status of volume: storage
Gluster process                            TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick storage2:/data/data-cluster          49152     0          Y       23101
Brick storage:/data/data-cluster           49152     0          Y       30773
Self-heal Daemon on localhost              N/A       N/A        Y       30050
Self-heal Daemon on storage                N/A       N/A        Y       30792
Any idea?
On storage I have:
Number of Peers: 1
Hostname: 195.65.194.217
Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
State: Peer in Cluster (Connected)
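Not something asked above, but a quick way to see whether the heal actually has entries queued for the new brick (volume name 'storage' taken from the status output) would be:

  # list entries pending heal per brick
  gluster volume heal storage info
  # kick off a full heal if the queue stays empty
  gluster volume heal storage full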
- Kindest regards,
Milos Cuculovic
IT Manager
---
MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculovic@xxxxxxxx
Skype: milos.cuculovic.mdpi
On 08.12.2016 13:55, Atin Mukherjee wrote:
Can you resend the attachment as zip? I am unable to extract the
content? We shouldn't have 0 info file. What does gluster peer
status
output say?
On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI <cuculovic@xxxxxxxx> wrote:
I hope you received my last email, Atin, thank you!
- Kindest regards,
Milos Cuculovic
IT Manager
---
MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculovic@xxxxxxxx
Skype: milos.cuculovic.mdpi
On 08.12.2016 10:28, Atin Mukherjee wrote:
---------- Forwarded message ----------
From: Atin Mukherjee <amukherj@xxxxxxxxxx>
Date: Thu, Dec 8, 2016 at 11:56 AM
Subject: Re: Replica brick not working
To: Ravishankar N <ravishankar@xxxxxxxxxx>
Cc: Miloš Čučulović - MDPI <cuculovic@xxxxxxxx>, Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>, gluster-users <gluster-users@xxxxxxxxxxx>
On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
From the log snippet:
[2016-12-07 09:15:35.677645] I [MSGID: 106482]
[glusterd-brick-ops.c:442:__glusterd_handle_add_brick]
0-management: Received add brick req
[2016-12-07 09:15:35.677708] I [MSGID: 106062]
[glusterd-brick-ops.c:494:__glusterd_handle_add_brick]
0-management: replica-count is 2
[2016-12-07 09:15:35.677735] E [MSGID: 106291]
[glusterd-brick-ops.c:614:__glusterd_handle_add_brick]
0-management:
The last log entry indicates that we hit this code path in gd_addbr_validate_replica_count ():

        if (replica_count == volinfo->replica_count) {
                if (!(total_bricks % volinfo->dist_leaf_count)) {
                        ret = 1;
                        goto out;
                }
        }
It seems unlikely that this snippet was hit, because we print the E [MSGID: 106291] message above only if ret==-1. gd_addbr_validate_replica_count() returns -1 without populating err_str only when volinfo->type doesn't match any of the known volume types, so perhaps volinfo->type is corrupted?
You are right, I missed that ret is set to 1 here in the above snippet.
@Milos - Can you please provide us the volume info file from /var/lib/glusterd/vols/<volname>/ from all three nodes to continue the analysis?
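For example, something like this on each node should be enough (assuming the volume is the 'storage' volume from the status output earlier):

  # the info file holds the volume definition glusterd uses
  cat /var/lib/glusterd/vols/storage/info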
-Ravi
@Pranith, Ravi - Milos was trying to convert a dist (1 x 1) volume to a replicate (1 x 2) using add-brick and hit this issue where add-brick failed. The cluster is operating with 3.7.6. Could you help on what scenario this code path can be hit? One straightforward issue I see here is the missing err_str in this path.
--
~ Atin (atinm)
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users