The crux of the problem is that, as of today, a brick process on
restart tries to reuse the port it was using before. It assumes
that no other process will be using that port and does not consult
pmap_registry_alloc() first. With a recent change,
pmap_registry_alloc() reassigns older ports that were used earlier
but are now free. Hence snapd now gets a port that was previously
used by a brick and tries to bind to it, while the older brick
process, without consulting the pmap table, blindly tries to use
the same port, and hence we see this problem.
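
To make the sequence concrete, here is a small toy model of the collision. It is only a sketch and not glusterd's actual pmap code: the class ToyPortRegistry, the helper restart_reusing_old_port(), and the port range are all hypothetical names and values chosen for illustration. The only assumption taken from the description above is that the allocator hands out ports that were used earlier and are now free.

```python
# Toy model of a port registry. NOT glusterd's real pmap implementation;
# all names and the port range below are made up for illustration.
class ToyPortRegistry:
    """Hands out the lowest free port in [base, last], reusing freed ports."""

    def __init__(self, base, last):
        self.base = base
        self.last = last
        self.owner = {}  # port -> name of the process registered on it

    def alloc(self, name):
        # Scan from the bottom so that a freed port is handed out again.
        for port in range(self.base, self.last + 1):
            if port not in self.owner:
                self.owner[port] = name
                return port
        raise RuntimeError("no free port in range")

    def release(self, port):
        self.owner.pop(port, None)


def restart_reusing_old_port(registry, name, old_port):
    """Model a restart that reuses old_port without asking the allocator."""
    holder = registry.owner.get(old_port)
    if holder is not None and holder != name:
        return ("conflict", old_port, holder)
    registry.owner[old_port] = name
    return ("ok", old_port, name)


registry = ToyPortRegistry(49152, 49160)
brick_port = registry.alloc("brick")   # brick starts and gets a port
registry.release(brick_port)           # brick goes down, port becomes free
snapd_port = registry.alloc("snapd")   # snapd is handed the freed port
result = restart_reusing_old_port(registry, "brick", brick_port)
print(brick_port, snapd_port, result)
# prints: 49152 49152 ('conflict', 49152, 'snapd')

# Asking the allocator on restart instead yields a different, free port.
new_port = registry.alloc("brick")
print(new_port)
# prints: 49153
```

In this model the conflict appears only because the restarting process skips the allocator; once it asks for a port like every other process, the registry never hands the same port to two owners.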

Now coming to the fix: I feel a brick process should not try to get its older port back and should just take a new port every time it comes up. We will not run out of ports with this change because pmap now allocates old ports again, so the port previously used by the brick process will eventually be reused. If anyone sees any concern with this approach, please raise it now.

While awaiting feedback, I have sent this patch (http://review.gluster.org/15001), which moves the said test case to bad tests for now. Once we collectively reach a conclusion on the fix, we will remove it from bad tests.

Regards,
Avra

On 07/25/2016 02:33 PM, Avra Sengupta wrote:
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel