Atin,
I was able to move forward a bit. Initially, I had this:
sudo gluster peer status
Number of Peers: 1
Hostname: storage2
Uuid: 32bef70a-9e31-403e-b9f3-ec9e1bd162ad
State: Peer Rejected (Connected)
Then, on storage2, I removed everything from /var/lib/glusterd except the info file.
Now I am getting another error message:
sudo gluster peer status
Number of Peers: 1
Hostname: storage2
Uuid: 32bef70a-9e31-403e-b9f3-ec9e1bd162ad
State: Sent and Received peer request (Connected)
But add-brick is still not working. I checked the hosts file and everything
seems OK; ping is also working well.
The thing I also need to know: when adding a new replicated brick, do I
need to sync all the files first, or does the new brick server need to be empty?
Also, do I first need to create the same volume on the new server, or
will adding it to the volume on server1 do that automatically?
- Kindest regards,
Milos Cuculovic
IT Manager
---
MDPI AG
Postfach, CH-4020 Basel, Switzerland
Office: St. Alban-Anlage 66, 4052 Basel, Switzerland
Tel. +41 61 683 77 35
Fax +41 61 302 89 18
Email: cuculovic@xxxxxxxx
Skype: milos.cuculovic.mdpi
On 14.12.2016 05:13, Atin Mukherjee wrote:
Milos,
I just managed to take a look at a similar issue, and my analysis is at
[1]. I remember you mentioning some incorrect /etc/hosts entries that
led to this same problem in the earlier case. Would you mind rechecking
them?
[1]
http://www.gluster.org/pipermail/gluster-users/2016-December/029443.html
On Wed, Dec 14, 2016 at 2:57 AM, Miloš Čučulović - MDPI
<cuculovic@xxxxxxxx> wrote:
Hi All,
Moving forward with my issue, sorry for the late reply!
I had some issues with the storage2 server (original volume), then
decided to use 3.9.0, so I have the latest version.
For that, I manually synced all the files to the storage server. I
installed gluster 3.9.0 there, started it, created a new volume called
storage, and all seems to work OK.
Now I need to create my replicated volume (add a new brick on the
storage2 server). Almost all the files are there. So, on the storage
server, I ran:
* sudo gluster peer probe storage2
* sudo gluster volume add-brick storage replica 2 storage2:/data/data-cluster force
But there I am receiving "volume add-brick: failed: Host storage2 is
not in 'Peer in Cluster' state".
Any idea?
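The error means glusterd on storage does not yet list storage2 as 'Peer in Cluster'. Below is a minimal shell sketch that checks for that state before attempting add-brick. It assumes the Hostname / Uuid / State layout of `gluster peer status` quoted in this thread. The helper name peer_in_cluster is hypothetical and not part of the gluster CLI.

```shell
# peer_in_cluster HOST
# Reads `gluster peer status` text on stdin and succeeds (exit 0) only if
# HOST is reported with "State: Peer in Cluster".
# Hypothetical helper; it only parses text and never calls gluster itself.
peer_in_cluster() {
    awk -v host="$1" '
        $1 == "Hostname:" { current = $2 }
        $1 == "State:" && current == host && /Peer in Cluster/ { found = 1 }
        END { exit (found ? 0 : 1) }
    '
}

# Example use on the storage server (left commented out on purpose):
# if sudo gluster peer status | peer_in_cluster storage2; then
#     sudo gluster volume add-brick storage replica 2 \
#         storage2:/data/data-cluster force
# else
#     echo "storage2 is not in the Peer in Cluster state yet" >&2
# fi
```

The check only tells you that add-brick would be refused; it does not say why the peer is stuck in another state.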
On 08.12.2016 17:52, Ravishankar N wrote:
On 12/08/2016 09:44 PM, Miloš Čučulović - MDPI wrote:
I was able to fix the sync by rsync-ing all the directories, then the
heal started. The next problem :) : as soon as there are files on the
new brick, the gluster mount will also use this brick, and the new
brick is not ready yet because the sync is not done, which results in
missing files on the client side. I temporarily removed the new brick;
now I am running a manual rsync and will add the brick again. I hope
this works.
What mechanism manages this issue? I guess there is something built in
to make a replica brick available only once the data is completely
synced.
This mechanism was introduced in 3.7.9 or 3.7.10
(http://review.gluster.org/#/c/13806/). Before that version, you had to
manually set some xattrs on the bricks so that healing could happen in
parallel while the client would still serve reads from the original
brick. I can't find the link to the doc that describes these steps for
setting the xattrs. :-(
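To see whether a heal is making progress, one option is to count the pending entries reported by `gluster volume heal <volname> info`. Below is a rough sketch, assuming the output carries one "Number of entries: N" line per brick (the exact format may differ between releases). The helper name pending_heal_entries is hypothetical.

```shell
# pending_heal_entries
# Reads `gluster volume heal <volname> info` text on stdin and prints the sum
# of all "Number of entries: N" lines. Non-numeric values such as "-" count
# as 0. Hypothetical helper; it only parses text.
pending_heal_entries() {
    awk '
        /^Number of entries:/ { total += $4 }
        END { print total + 0 }
    '
}

# Example use (commented out): poll until nothing is pending.
# while [ "$(sudo gluster volume heal storage info | pending_heal_entries)" -gt 0 ]; do
#     sleep 60
# done
```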
Calling it a day,
Ravi
On 08.12.2016 16:17, Ravishankar N wrote:
On 12/08/2016 06:53 PM, Atin Mukherjee wrote:
On Thu, Dec 8, 2016 at 6:44 PM, Miloš Čučulović - MDPI
<cuculovic@xxxxxxxx> wrote:
Ah, damn! I found the issue. On the storage server, the storage2 IP
address was wrong: I transposed two digits in the /etc/hosts file,
sorry for that :(
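A typo like this is easy to miss by eye. Below is a small sketch, assuming you know the address each peer should have, that looks up a name in hosts-format text so it can be compared against the expected value. The helper name hosts_ip and the 192.0.2.x address are illustrative only.

```shell
# hosts_ip NAME
# Reads /etc/hosts-style text on stdin and prints the first address whose
# line lists NAME as a hostname or alias. Comments are ignored.
# Hypothetical helper for illustration.
hosts_ip() {
    awk -v name="$1" '
        { sub(/#.*/, "") }                 # drop trailing comments
        {
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    '
}

# Example use (commented out); 192.0.2.20 stands in for the real address:
# expected=192.0.2.20
# actual=$(hosts_ip storage2 < /etc/hosts)
# [ "$actual" = "$expected" ] || echo "storage2 maps to $actual, expected $expected" >&2
```

Running the same comparison on every node catches the case where only one server has the wrong entry.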
I was able to add the brick now and I started the heal, but there is
still no data transfer visible.
1. Are the files getting created on the new brick though?
2. Can you provide the output of `getfattr -d -m . -e hex
/data/data-cluster` on both bricks?
3. Is it possible to attach gdb to the self-heal daemon on the original
(old) brick and get a backtrace?
`gdb -p <pid of self-heal daemon on the original brick>`
thread apply all bt -->share this output
quit gdb.
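The gdb steps in item 3 can also be run non-interactively. Below is a sketch, assuming `gluster volume status` prints the local self-heal daemon on a single line with the PID in the last column. The helper name shd_pid and the output file name are hypothetical.

```shell
# shd_pid
# Reads `gluster volume status` text on stdin and prints the PID (last
# column) of the "Self-heal Daemon on localhost" row. Hypothetical helper.
shd_pid() {
    awk '/^Self-heal Daemon on localhost/ { print $NF; exit }'
}

# Example use on the original brick's server (commented out):
# pid=$(sudo gluster volume status storage | shd_pid)
# sudo gdb -p "$pid" -batch -ex 'thread apply all bt' > shd-backtrace.txt 2>&1
```

With -batch, gdb runs the given command and exits on its own, so there is no separate quit step.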
-Ravi
@Ravi/Pranith - can you help here?
By doing gluster volume status, I have
Status of volume: storage
Gluster process                             TCP Port  RDMA Port  Online  Pid
------------------------------------------------------------------------------
Brick storage2:/data/data-cluster           49152     0          Y       23101
Brick storage:/data/data-cluster            49152     0          Y       30773
Self-heal Daemon on localhost               N/A       N/A        Y       30050
Self-heal Daemon on storage                 N/A       N/A        Y       30792
Any idea?
On storage I have:
Number of Peers: 1
Hostname: 195.65.194.217
Uuid: 7c988af2-9f76-4843-8e6f-d94866d57bb0
State: Peer in Cluster (Connected)
On 08.12.2016 13:55, Atin Mukherjee wrote:
Can you resend the attachment as a zip? I am unable to extract the
content. We shouldn't have 0 info file. What does the gluster peer status
output say?
On Thu, Dec 8, 2016 at 4:51 PM, Miloš Čučulović - MDPI
<cuculovic@xxxxxxxx> wrote:
I hope you received my last email Atin, thank you!
On 08.12.2016 10:28, Atin Mukherjee wrote:
---------- Forwarded message ----------
From: Atin Mukherjee <amukherj@xxxxxxxxxx>
Date: Thu, Dec 8, 2016 at 11:56 AM
Subject: Re: Replica brick not working
To: Ravishankar N <ravishankar@xxxxxxxxxx>
Cc: Miloš Čučulović - MDPI <cuculovic@xxxxxxxx>,
Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>,
gluster-users <gluster-users@xxxxxxxxxxx>
On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N
<ravishankar@xxxxxxxxxx> wrote:
On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
From the log snippet:
[2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req
[2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2
[2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:
The last log entry indicates that we hit the code path in
gd_addbr_validate_replica_count ()
    if (replica_count == volinfo->replica_count) {
            if (!(total_bricks % volinfo->dist_leaf_count)) {
                    ret = 1;
                    goto out;
            }
    }
It seems unlikely that this snippet was hit, because we print the E
[MSGID: 106291] in the above message only if ret == -1.
gd_addbr_validate_replica_count() returns -1 without populating err_str
only when volinfo->type doesn't match any of the known volume types, so
perhaps volinfo->type is corrupted?
You are right, I missed that ret is set to 1 here in the above snippet.
@Milos - Can you please provide us the volume info file from
/var/lib/glusterd/vols/<volname>/ from all the three nodes to continue
the analysis?
-Ravi
@Pranith, Ravi - Milos was trying to convert a dist (1 X 1) volume to a
replicate (1 X 2) using add-brick and hit this issue where add-brick
failed. The cluster is operating with 3.7.6. Could you help with what
scenario this code path can be hit in? One straightforward issue I see
here is the missing err_str in this path.
--
~ Atin (atinm)
_______________________________________________
Gluster-users mailing list
Gluster-users@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-users