On Thu, Dec 8, 2016 at 11:11 AM, Ravishankar N <ravishankar@xxxxxxxxxx> wrote:
On 12/08/2016 10:43 AM, Atin Mukherjee wrote:
From the log snippet:
[2016-12-07 09:15:35.677645] I [MSGID: 106482] [glusterd-brick-ops.c:442:__glusterd_handle_add_brick] 0-management: Received add brick req
[2016-12-07 09:15:35.677708] I [MSGID: 106062] [glusterd-brick-ops.c:494:__glusterd_handle_add_brick] 0-management: replica-count is 2
[2016-12-07 09:15:35.677735] E [MSGID: 106291] [glusterd-brick-ops.c:614:__glusterd_handle_add_brick] 0-management:
The last log entry indicates that we hit the following code path in gd_addbr_validate_replica_count():

        if (replica_count == volinfo->replica_count) {
                if (!(total_bricks % volinfo->dist_leaf_count)) {
                        ret = 1;
                        goto out;
                }
        }
It seems unlikely that this snippet was hit, because we print the E [MSGID: 106291] message above only if ret == -1.
gd_addbr_validate_replica_count() returns -1 without populating err_str only when volinfo->type doesn't match any of the known volume types, so perhaps volinfo->type is corrupted?
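To make that reasoning concrete, here is a minimal sketch of how a switch over the volume type with no default case can return -1 while leaving err_str empty. It is a simplified model, not the glusterd source: every sketch_* name, the enum values, the struct layout and the second error string are assumptions made for illustration, and the replicate branch is reduced to the one check quoted above.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins for the glusterd volume types. The names and
 * values are illustrative only and are not copied from glusterd headers. */
enum sketch_vol_type {
        SKETCH_TYPE_DISTRIBUTE = 0,
        SKETCH_TYPE_REPLICATE  = 1
};

/* Hypothetical, trimmed-down volume info. */
struct sketch_volinfo {
        int type;             /* an enum sketch_vol_type value, unless corrupted */
        int brick_count;
        int replica_count;
        int dist_leaf_count;
};

/* Returns 0 or 1 on success and -1 on failure. err_str is written only by
 * the branches that recognise the volume type. */
static int
sketch_validate_replica_count (const struct sketch_volinfo *volinfo,
                               int replica_count, int total_bricks,
                               char *err_str, size_t err_len)
{
        int ret = -1;

        switch (volinfo->type) {
        case SKETCH_TYPE_DISTRIBUTE:
                if (volinfo->brick_count * replica_count == total_bricks) {
                        ret = 0;        /* distribute -> replicate conversion */
                        goto out;
                }
                snprintf (err_str, err_len,
                          "Incorrect number of bricks (%d) supplied for "
                          "replica count (%d).",
                          total_bricks - volinfo->brick_count, replica_count);
                goto out;
        case SKETCH_TYPE_REPLICATE:
                if (replica_count == volinfo->replica_count &&
                    volinfo->dist_leaf_count > 0 &&
                    !(total_bricks % volinfo->dist_leaf_count)) {
                        ret = 1;        /* the snippet quoted in this thread */
                        goto out;
                }
                snprintf (err_str, err_len,
                          "replica change not modelled in this sketch");
                goto out;
        /* No default case: an unrecognised type falls through to 'out'
         * with ret still -1 and err_str untouched. */
        }
out:
        return ret;
}
```

In this model, a 1 x 1 distribute volume going to replica 2 with 2 total bricks returns 0, while the same call with an unrecognised type value (say 42) returns -1 and leaves err_str empty, which would match an error log line with no message text.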
You are right, I missed that ret is set to 1 here in the above snippet.
@Milos - Can you please provide us the volume info file from /var/lib/glusterd/vols/<volname>/ from all the three nodes to continue the analysis?
-Ravi
@Pranith, Ravi - Milos was trying to convert a dist (1 X 1) volume to a replicate (1 X 2) volume using add-brick and hit this issue, where add-brick failed. The cluster is running 3.7.6. Could you help us understand in what scenario this code path can be hit? One straightforward issue I see here is the missing err_str in this path.
--
~ Atin (atinm)
_______________________________________________ Gluster-users mailing list Gluster-users@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-users