Re: replace-brick commit force fails in multi node cluster

Karthik Subrahmanya <ksubrahm@xxxxxxxxxx> · Wed, 28 Mar 2018 18:35:47 +0530

Hey Atin,

This happens because glusterd on the third node is brought down before the replace-brick is done.
During replace-brick we do a temporary mount to mark pending xattrs on the source bricks, indicating that the brick being replaced is the sink.
But in this case, since glusterd is down on the node hosting one of the source bricks, the mount cannot get the port on which that brick is listening,
and so setting the "trusted.replace-brick" attribute fails.
For a replica 3 volume, a fop is treated as successful only if it succeeds on at least a quorum of the bricks. Hence the replace-brick fails.
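
To make the quorum arithmetic concrete, here is a minimal sketch. It assumes a simple majority rule (more than half of the bricks must succeed), which is a simplification of the real quorum options. The helper names quorum_needed and fop_succeeds are made up for illustration and are not part of glusterfs or its test framework.

```shell
#!/bin/bash
# Illustrative only: simple majority quorum, assumed here for an odd
# replica count. quorum_needed and fop_succeeds are hypothetical helpers.

quorum_needed () {
        local replica_count=$1
        # More than half of the bricks must acknowledge the fop.
        echo $(( replica_count / 2 + 1 ))
}

fop_succeeds () {
        local replica_count=$1 successes=$2
        if [ "$successes" -ge "$(quorum_needed "$replica_count")" ]; then
                echo "yes"
        else
                echo "no"
        fi
}

# Replica 3: the brick being replaced is the sink and one source brick is
# unreachable, so only one brick can acknowledge the xattr.
fop_succeeds 3 1

# Same volume with both source bricks reachable.
fop_succeeds 3 2
```

Under that assumption the first call prints "no" and the second prints "yes", which matches the behaviour described above: with one source brick unreachable, the setxattr cannot reach quorum.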

On the QE setup the replace-brick would have succeeded only because of some race between glusterd going down and the replace-brick happening.
Otherwise there is no chance for the replace-brick to succeed.

Regards,
Karthik

On Tue, Mar 27, 2018 at 7:25 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
While writing a test for the fix for BZ https://bugzilla.redhat.com/show_bug.cgi?id=1560957 I just can't get my test case to pass: a replace-brick commit force always fails on a multi-node cluster, and that's on the latest mainline code.

The fix is a one liner:

atin@dhcp35-96:~/codebase/upstream/glusterfs_master/glusterfs$ gd HEAD~1

diff --git a/xlators/mgmt/glusterd/src/glusterd-utils.c b/xlators/mgmt/glusterd/src/glusterd-utils.c
index af30756c9..24d813fbd 100644
--- a/xlators/mgmt/glusterd/src/glusterd-utils.c
+++ b/xlators/mgmt/glusterd/src/glusterd-utils.c
@@ -5995,6 +5995,7 @@ glusterd_brick_start (glusterd_volinfo_t *volinfo,
                          * TBD: re-use RPC connection across bricks
                          */
                         if (is_brick_mx_enabled ()) {
+                                brickinfo->port_registered = _gf_true;
                                 ret = glusterd_get_sock_from_brick_pid (pid, socketpath,
                                                                         sizeof(socketpath));
                                 if (ret) {




The test does the following:

#!/bin/bash                                                                        
                                                                                   
. $(dirname $0)/../../include.rc                                                   
. $(dirname $0)/../../cluster.rc                                                   
. $(dirname $0)/../../volume.rc                                                    
                                                                                   
                                                                                   
cleanup;                                                                           
                                                                                   
TEST launch_cluster 3;                                                             
                                                                                   
TEST $CLI_1 peer probe $H2;                                                        
EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count                                          
                                                                                   
TEST $CLI_1 peer probe $H3;                                                        
EXPECT_WITHIN $PROBE_TIMEOUT 2 peer_count                                          
                                                                                   
TEST $CLI_1 volume set all cluster.brick-multiplex on                              
                                                                                   
TEST $CLI_1 volume create $V0 replica 3 $H1:$B1/${V0}1 $H2:$B2/${V0}1 $H3:$B3/${V0}1 
                                                                                   
TEST $CLI_1 volume start $V0                                                       
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H1 $B1/${V0}1         
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H2 $B2/${V0}1         
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H3 $B3/${V0}1         
                                                                                   
                                                                                   
#bug-1560957 - replace brick followed by an add-brick in a brick mux setup         
#brings down one brick instance                                                    
                                                                                   
kill_glusterd 3                                                                    
EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count                                          
TEST $CLI_1 volume replace-brick $V0 $H1:$B1/${V0}1 $H1:$B1/${V0}1_new commit force 

# This is where the test always fails, saying: "volume replace-brick: failed: Commit failed on localhost. Please check log file for details."
                                                                                   
TEST $glusterd_3                                                                   
EXPECT_WITHIN $PROBE_TIMEOUT 2 peer_count                                          
                                                                                   
TEST $CLI_1 volume add-brick $V0 replica 3 $H1:$B1/${V0}3 $H2:$B1/${V0}3 $H3:$B1/${V0}3 commit force
                                                                                   
EXPECT_WITHIN $PROCESS_UP_TIMEOUT "1" brick_up_status_1 $V0 $H3 $H3:$B1/${V0}1  
cleanup;   

glusterd log from the 1st node:
[2018-03-27 13:11:58.630845] E [MSGID: 106053] [glusterd-utils.c:13889:glusterd_handle_replicate_brick_ops] 0-management: Failed to set extended attribute trusted.replace-brick : Transport endpoint is not connected [Transport endpoint is not connected]

Requesting some help/attention from the AFR folks.



_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel