Re: [Gluster-Maintainers] Another regression in release-3.7 and master

On Thu, Apr 07, 2016 at 07:24:05PM +0530, Kaushal M wrote:
> On Thu, Apr 7, 2016 at 6:23 PM, Kaushal M <kshlmster@xxxxxxxxx> wrote:
> > On Thu, Apr 7, 2016 at 6:00 PM, Atin Mukherjee <amukherj@xxxxxxxxxx> wrote:
> >>
> >>
> >> On 04/07/2016 05:37 PM, Kaushal M wrote:
> >>>
> >>> On 7 Apr 2016 5:36 p.m., "Niels de Vos" <ndevos@xxxxxxxxxx> wrote:
> >>>>
> >>>> On Thu, Apr 07, 2016 at 05:13:54PM +0530, Kaushal M wrote:
> >>>> > On Thu, Apr 7, 2016 at 5:11 PM, Kaushal M <kshlmster@xxxxxxxxx> wrote:
> >>>> > > We've hit another regression.
> >>>> > >
> >>>> > > With management encryption enabled, daemons like NFS and SHD don't
> >>>> > > start on the current heads of release-3.7 and master branches.
> >>>> > >
> >>>> > > I still have no clear root cause for it, and would appreciate some
> >>>> > > help.
> >>>> >
> >>>> > This was working with 3.7.9 from what I've heard.
> >>>>
> >>>> Do we have a simple test-case for this? If someone writes a script, we
> >>>> should be able to "git bisect" it pretty quickly.
> >>>
> >>> I am doing this right now.
> >> "b33f3c9 glusterd: Bug fixes for IPv6 support" has caused this
> >> regression. I am yet to find the RCA though.
> >
> > git-bisect agrees with this as well.
> >
> > I initially thought it was because GlusterD didn't listen on IPv6
> > (checked using `ss`).
> > This change makes it so that connections to localhost use ::1 instead
> > of 127.0.0.1, and so the connection failed.
> > This should have caused all connection attempts to fail, irrespective
> > of it being encrypted or not.
> > But the failure only happens when management encryption is enabled.
> > So this theory doesn't make sense.
> 
> This is part of the problem!
> 
> The initial IPv6 connection to ::1 fails for unencrypted connections as well.
> But these connections correctly retry the connect with the next address
> once the first connect attempt fails.
> Since the next address is 127.0.0.1, the connection succeeds, volfile
> is fetched and the daemon starts.
> 
> Encrypted connections, on the other hand, give up after the first
> failure and don't attempt a reconnect.
> This is somewhat surprising to me, as I'd recently fixed an issue
> which caused crashes when encrypted connections attempted a reconnect
> after a failure to connect.
> 
> I'll diagnose this a little bit more and try to find a solution.

Or revert the change, since it was introduced in 3.7.10 and nobody relies
on it yet, and try to get it fixed properly for 3.7.12?

Niels
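
For reference, here is a minimal sketch of the difference described in the
quoted analysis: one connect path walks the resolved address list and falls
back to the next entry when a connect fails, the other gives up after the
first entry. This is not GlusterFS code. The function names and the
(family, sockaddr) candidate list are assumptions made up for illustration,
and the real rpc/socket transport is structured differently.

```python
import socket


def tcp_connect(family, sockaddr, timeout=1.0):
    """Open a TCP connection to a single address (hypothetical helper)."""
    sock = socket.socket(family, socket.SOCK_STREAM)
    try:
        sock.settimeout(timeout)
        sock.connect(sockaddr)
    except OSError:
        sock.close()
        raise
    return sock


def connect_with_fallback(candidates, connect=tcp_connect):
    """Try every candidate in order and return the first successful result.

    Models the behaviour described for unencrypted connections:
    ::1 fails, so the next address (127.0.0.1) is tried.
    """
    errors = []
    for family, sockaddr in candidates:
        try:
            return connect(family, sockaddr)
        except OSError as err:
            # Remember the failure and move on to the next address.
            errors.append((sockaddr, err))
    raise ConnectionError(f"all {len(candidates)} addresses failed: {errors}")


def connect_first_only(candidates, connect=tcp_connect):
    """Try only the first candidate and let any failure propagate.

    Models the behaviour described for encrypted connections:
    no retry with the next address after the first failure.
    """
    family, sockaddr = candidates[0]
    return connect(family, sockaddr)
```

With a candidate list of ::1 followed by 127.0.0.1 and nothing listening on
::1, connect_with_fallback() ends up connected over IPv4, while
connect_first_only() raises on the ::1 attempt. That matches the symptom
reported above, where only the daemons using management encryption fail to
start.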


_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://www.gluster.org/mailman/listinfo/gluster-devel
