Re: regression failed : snapshot/bug-1316437.t

On Mon, Jul 25, 2016 at 4:34 PM, Avra Sengupta <asengupt@xxxxxxxxxx> wrote:
The crux of the problem is that, as of today, brick processes on restart try to reuse the port they were using before, assuming that no other process will be using it and without consulting pmap_registry_alloc() first. With a recent change, pmap_registry_alloc() reassigns older ports that were once in use but are now free. Hence snapd now gets a port that was previously used by a brick and tries to bind to it, while the older brick process, without consulting the pmap table, blindly tries to bind to that same port, and hence we see this problem.
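To make the interleaving concrete, here is a minimal toy sketch of the behaviour described above; pmap_alloc()/pmap_free() are hypothetical stand-ins for the real registry code, not glusterd source:

/* Toy model of the collision: a registry that hands out the lowest
 * free port, and a brick that blindly reuses its old port on restart. */
#include <stdbool.h>
#include <stdio.h>

#define PMAP_BASE   49152
#define PMAP_NPORTS 16

static bool port_in_use[PMAP_NPORTS];  /* the registry's view */

/* Stand-in for pmap_registry_alloc(): lowest free port, which with
 * the recent change includes ports used earlier but freed since. */
static int pmap_alloc(void)
{
    for (int i = 0; i < PMAP_NPORTS; i++) {
        if (!port_in_use[i]) {
            port_in_use[i] = true;
            return PMAP_BASE + i;
        }
    }
    return -1;
}

static void pmap_free(int port)
{
    port_in_use[port - PMAP_BASE] = false;
}

int main(void)
{
    int brick_port = pmap_alloc();  /* brick comes up on 49152      */
    pmap_free(brick_port);          /* brick goes down; port freed  */

    int snapd_port = pmap_alloc();  /* snapd consults the registry
                                       and is handed 49152 again    */

    /* brick restarts and reuses brick_port without asking pmap: */
    if (brick_port == snapd_port)
        printf("collision on port %d -> EADDRINUSE for whichever "
               "process binds second\n", brick_port);
    return 0;
}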

Now coming to the fix: I feel the brick process should not try to get its older port back, and should just take a new port every time it comes up. We will not run out of ports with this change, because pmap now allocates old ports again, so the port previously used by the brick process will eventually be reused. If anyone sees any concern with this approach, please raise it now.
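In terms of the toy sketch above (and appended to it, so pmap_alloc() is in scope), the proposed change amounts to one line in a hypothetical brick restart path; this is not the actual patch:

static int brick_restart_port(int old_port)
{
    (void)old_port;       /* old behaviour: return old_port; (blind reuse) */
    return pmap_alloc();  /* proposed: always take a fresh, registry-
                             tracked port; old_port falls back into the
                             pool and is eventually handed out again, so
                             no ports are leaked */
}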

Looks OK to me, but I'll think it through and get back to you within a day or two if I have any objections.


While awaiting feedback from you all, I have sent this patch (http://review.gluster.org/15001), which moves the said test case to the bad tests for now. Once we collectively reach a conclusion on the fix, we will remove it from the bad tests.

Regards,
Avra


On 07/25/2016 02:33 PM, Avra Sengupta wrote:
The failure suggests that the port snapd is trying to bind to is already in use. But snapd has been modified to use a new port every time. I am looking into this.

On 07/25/2016 02:23 PM, Nithya Balachandran wrote:
More failures:

I see these messages in the snapd.log:

[2016-07-22 05:31:52.482282] I [rpcsvc.c:2199:rpcsvc_set_outstanding_rpc_limit] 0-rpc-service: Configured rpc.outstanding-rpc-limit with value 64
[2016-07-22 05:31:52.482352] W [MSGID: 101002] [options.c:954:xl_opt_validate] 0-patchy-server: option 'listen-port' is deprecated, preferred is 'transport.socket.listen-port', continuing with correction
[2016-07-22 05:31:52.482436] E [socket.c:771:__socket_server_bind] 0-tcp.patchy-server: binding to  failed: Address already in use
[2016-07-22 05:31:52.482447] E [socket.c:774:__socket_server_bind] 0-tcp.patchy-server: Port is already in use
[2016-07-22 05:31:52.482459] W [rpcsvc.c:1630:rpcsvc_create_listener] 0-rpc-service: listening on transport failed
[2016-07-22 05:31:52.482469] W [MSGID: 115045] [server.c:1061:init] 0-patchy-server: creation of listener failed
[2016-07-22 05:31:52.482481] E [MSGID: 101019] [xlator.c:433:xlator_init] 0-patchy-server: Initialization of volume 'patchy-server' failed, review your volfile again
[2016-07-22 05:31:52.482491] E [MSGID: 101066] [graph.c:324:glusterfs_graph_init] 0-patchy-server: initializing translator failed
[2016-07-22 05:31:52.482499] E [MSGID: 101176] [graph.c:670:glusterfs_graph_activate] 0-graph: init failed
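The two socket.c errors above are just bind(2) failing with EADDRINUSE, which strerror() renders as "Address already in use". A standalone repro of that failure mode (port 49152 here is an arbitrary example, not the port from the test):

#include <errno.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Bind a TCP socket to the given loopback port; return the fd, or -1
 * (printing the errno string) on failure. */
static int bind_port(int port)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) != 0) {
        fprintf(stderr, "bind to %d failed: %s\n", port, strerror(errno));
        return -1;
    }
    return fd;
}

int main(void)
{
    int first  = bind_port(49152);  /* first process gets the port      */
    int second = bind_port(49152);  /* second bind: EADDRINUSE, as in
                                       the snapd.log above              */
    return (first >= 0 && second < 0) ? 0 : 1;
}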

On Mon, Jul 25, 2016 at 12:00 PM, Ashish Pandey <aspandey@xxxxxxxxxx> wrote:
Hi,

The following test has failed 3 times in the last two days:

./tests/bugs/snapshot/bug-1316437.t

Please take a look at it and check whether or not it is a spurious failure.

Ashish


--

--Atin
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
