On 2014-07-24 08:21, Joseph Fernandes wrote:
> Hi All,
>
> After further investigation we have the root cause for this issue.
> The root cause is the way in which a new node is added to the cluster.
>
> We have two nodes in the cluster, N1 (127.1.1.1) and N2 (127.1.1.2),
> each with one brick: N1:B1 (127.1.1.1:49146) and N2:B2 (127.1.1.2:49147).
>
> Now let's peer probe N3 (127.1.1.3) from N1:
>
> 1) A friend request is sent from N1 to N3. N3 adds N1 and its UUID,
>    say [UUID1], to its peerinfo list.
> 2) N3 gets the brick infos from N1.
> 3) N3 tries to start the bricks:
>    1) N3 tries to start brick B1 and finds that it is not a local brick,
>       using the check MY_UUID == brickinfo->uuid, which is false here:
>       the UUID of brickinfo->hostname (N1) is [UUID1] (per the peerinfo
>       list) while MY_UUID is [UUID3]. Hence N3 does not start it.
>    2) N3 tries to start brick B2, and here lies the problem. N3 uses
>       glusterd_resolve_brick() to resolve the UUID of B2->hostname (N2).
>       glusterd_resolve_brick() cannot find N2 in the peerinfo list, so it
>       checks whether N2 is a local loopback address. Since N2 (127.1.1.2)
>       starts with "127", it decides that it is a local loopback address
>       and fills brickinfo->uuid with [UUID3] (see the standalone sketch
>       at the end of this message). Now that brickinfo->uuid == MY_UUID is
>       true, N3 starts the brick process B2 with -s 127.1.1.2 and
>       *-posix.glusterd-uuid=[UUID3]. This process dies immediately, but
>       for a short time it holds on to its --brick-port, say 49155.
>
> All of the above was observed and inferred from the glusterd logs on N3
> (with some extra debug messages added).
>
> Coming back to our test case, i.e. firing snapshot create and peer probe
> together: if N2 has assigned 49155 as the --brick-port for the snapshot
> brick, it finds that 49155 is already held by another process (the faulty
> brick process N3:B2 (127.1.1.2:49155), which has -s 127.1.1.2 and
> *-posix.glusterd-uuid=[UUID3]) and hence fails to start the snapshot
> brick process.
>
> 1) The failure is spurious: it is purely a matter of chance whether N2
>    and N3 pick the same port for their brick processes.
> 2) The issue can only occur in a regression test scenario, where all the
>    nodes run on the same machine and are distinguished only by different
>    loopback addresses (127.1.1.*).
> 3) The logic that "127" denotes a local loopback address is not wrong in
>    itself, since glusterds are supposed to run on different machines in
>    real deployments.
>
> Please do share your thoughts on this, and on what a possible fix would be.

Possible solutions (many/all of them probably break important assumptions):

* Use some alias address range instead of 127.*.*.* for testing purposes
* Stop treating localhost as special
* Adopt the systemd LISTEN_FDS approach and have a special program that
  tries to bind to the ports and then hands each port over to the proper
  daemon (roughly sketched below)

/Anders

--
Anders Blomdell                  Email: anders.blomdell@xxxxxxxxxxxxxx
Department of Automatic Control
Lund University                  Phone: +46 46 222 4625
P.O. Box 118                     Fax:   +46 46 138118
SE-221 00 Lund, Sweden
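
To make the failure mode concrete, here is a minimal standalone sketch (not
the actual glusterd_resolve_brick() code) of the kind of prefix-based
loopback test described above, fed the three addresses from the scenario:

    /* Minimal standalone sketch, NOT the actual glusterd code: a
     * prefix test that treats the whole 127.0.0.0/8 block as local,
     * which is correct per RFC 5735 but misfires when a regression
     * test uses 127.1.1.* to simulate distinct nodes. */
    #include <stdbool.h>
    #include <stdio.h>
    #include <string.h>

    static bool
    is_loopback_by_prefix (const char *addr)
    {
            /* The message describes the check as "starts with 127". */
            return strncmp (addr, "127.", 4) == 0;
    }

    int
    main (void)
    {
            const char *peers[] = { "127.1.1.1", "127.1.1.2", "127.1.1.3" };

            for (size_t i = 0; i < 3; i++)
                    printf ("%s -> %s\n", peers[i],
                            is_loopback_by_prefix (peers[i])
                            ? "treated as local" : "remote");
            return 0;
    }

Run on N3, every address prints "treated as local", which is exactly why N3
concludes that N2's brick B2 (127.1.1.2) is its own and starts it with
[UUID3].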
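
And a hypothetical sketch of the LISTEN_FDS idea from the solution list: a
tiny launcher binds the port itself and execs the real daemon, passing the
listening socket as fd 3 in the systemd socket-activation convention
(LISTEN_FDS/LISTEN_PID). The launcher name and invocation are invented for
illustration; glusterfsd would additionally have to learn to accept an
inherited socket instead of binding one itself.

    /* Hypothetical launcher: bind first, exec the daemon second, so
     * port ownership is decided atomically up front instead of racing
     * between two daemons that bind on their own. */
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <unistd.h>

    int
    main (int argc, char *argv[])
    {
            int                fd;
            struct sockaddr_in addr;
            char               buf[32];

            if (argc < 3) {
                    fprintf (stderr, "usage: %s <port> <daemon> [args...]\n",
                             argv[0]);
                    return 1;
            }

            fd = socket (AF_INET, SOCK_STREAM, 0);
            memset (&addr, 0, sizeof (addr));
            addr.sin_family      = AF_INET;
            addr.sin_addr.s_addr = htonl (INADDR_ANY);
            addr.sin_port        = htons (atoi (argv[1]));

            /* Bind and listen before the daemon exists; a second
             * launcher asking for the same port fails right here. */
            if (fd < 0 ||
                bind (fd, (struct sockaddr *) &addr, sizeof (addr)) < 0 ||
                listen (fd, 16) < 0) {
                    perror ("socket/bind/listen");
                    return 1;
            }

            /* systemd convention: the first passed socket is fd 3. */
            if (fd != 3) {
                    dup2 (fd, 3);
                    close (fd);
            }
            snprintf (buf, sizeof (buf), "%d", (int) getpid ());
            setenv ("LISTEN_PID", buf, 1);
            setenv ("LISTEN_FDS", "1", 1);

            /* exec keeps our pid, so LISTEN_PID stays valid. */
            execvp (argv[2], &argv[2]);
            perror ("execvp");
            return 1;
    }

Usage would be something like "bind-and-exec 49155 glusterfsd ..."
(hypothetical), making the "port already acquired" case an immediate,
deterministic failure in the launcher rather than a spurious one later.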