Re: IMP: Release 3.10: RC1 Pending bugs (Need fixes by 21st Feb)

On Tue, Feb 21, 2017 at 6:15 PM, Jeff Darcy <jdarcy@xxxxxxxxxx> wrote:
> >   2) Bug 1421590 - Bricks take up new ports upon volume restart after
> > add-brick op with brick mux enabled
> >     - Status: *Atin/Samikshan/Jeff*, any update on this?
> >       - Can we document this as a known issue? What would be the way to
> > get volume to use the older ports (a glusterd restart?)?

> That would work, but is clearly less than ideal.
> >
> > Patch under review for master. Atin/Samikshan, are we going to wait for
> > this to be backported? The last update was that we need not consider this
> > a blocker for the release; does that still hold?

> I can review/backport this today, despite being on vacation (we're all
> giving our legs a day off).  Does that help?

I don't think it's a blocker, so irrespective of whether the patch gets in or not, the release should not be blocked. We can mark it as a known issue in the release notes.
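If we do mark it as a known issue, the workaround could be captured in the release notes roughly as below (just a sketch of my understanding of the discussion above; <volname> is a placeholder and the service restart assumes a systemd-based install):

    # Check which ports the bricks are currently listening on
    gluster volume status <volname>

    # Restarting glusterd brings the bricks back on their previously
    # assigned ports (workaround for bug 1421590)
    systemctl restart glusterd

    # Confirm the ports have gone back to the older values
    gluster volume status <volname>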
 

> >   4) Bug 1422769 - brick process crashes when glusterd is restarted
> >
> > Atin/Samikshan, thoughts on this?

> Very slight possibility this was an after-effect of 1421721, which is
> fixed.  Hard to tell, though, since I was never able to reproduce it
> on my systems.

Again, not a blocker; I was unable to hit it even with multiple attempts.
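For the record, what I kept retrying was roughly the following (a sketch only; <volname> is a placeholder and the restart assumes a systemd-based install):

    # Note the brick PIDs before the restart
    gluster volume status <volname>

    # Restart glusterd and give it a few seconds to reconnect to the bricks
    systemctl restart glusterd
    sleep 5

    # The bricks should still be online with the same PIDs, i.e. no crash
    gluster volume status <volname>
    pgrep -a glusterfsd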
 

> >   5) Bug 1422781 - Transport endpoint not connected error seen on client
> > when glusterd is restarted
> >     - Status: Repro not clean across setups, still debugging the problem
> >
> > Atin/Samikshan, were we able to narrow this down, post attempts to
> > reproduce it?

> Still trying to figure out why this won't reproduce using the cluster.rc
> stuff.  There seemed to be some possibility that it was related to the
> amount of I/O that was active while GlusterD was restarted, or to things
> being in containers, but I haven't heard back from Atin.

Although I was able to hit this issue at the first attempt on the release-3.10 head when the bug was filed, I couldn't reproduce it later despite multiple attempts. I'm not sure whether a patch that went in since has fixed it. Given it's not consistent, we can probably live with it.
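For reference, the kind of cluster.rc-based reproducer being attempted is along these lines (a rough sketch written from memory of the tests/cluster.rc and include.rc helpers, not the exact test; the relative include paths assume the file sits under tests/bugs/glusterd/):

    #!/bin/bash
    . $(dirname $0)/../../include.rc
    . $(dirname $0)/../../cluster.rc

    cleanup;

    # Two-node cluster with a 2-brick distribute volume
    TEST launch_cluster 2;
    TEST $CLI_1 peer probe $H2;
    EXPECT_WITHIN $PROBE_TIMEOUT 1 peer_count;

    TEST $CLI_1 volume create $V0 $H1:$B1/${V0}1 $H2:$B2/${V0}2
    TEST $CLI_1 volume start $V0

    # Mount from node 1 and keep some I/O running
    TEST glusterfs -s $H1 --volfile-id $V0 $M0
    dd if=/dev/zero of=$M0/data bs=1M count=100 &

    # Restart glusterd on node 1 while the I/O is in flight; the bug would
    # show up as "Transport endpoint is not connected" on the client here
    TEST kill_glusterd 1
    TEST start_glusterd 1
    wait
    TEST ls $M0

    cleanup;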


However, I am a bit bothered about https://bugzilla.redhat.com/show_bug.cgi?id=1421724, where we see a flood of log entries in glusterd.log if a volume is restarted after turning on brick multiplexing, and this scenario does look to be hit frequently in the setup. I worked with Samikshan & Gaurav and figured out a possible RCA. I have posted a patch at https://review.gluster.org/#/c/16699. Reviews, please?
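For reviewers, the scenario itself is easy to hit, roughly as below (a sketch; <volname> is a placeholder and the log path is the default one):

    # Enable brick multiplexing (global option)
    gluster volume set all cluster.brick-multiplex on

    # Restart any volume (--mode=script skips the confirmation prompt)
    gluster --mode=script volume stop <volname>
    gluster volume start <volname>

    # Without the patch, glusterd.log starts flooding with repeated entries
    tail -f /var/log/glusterfs/glusterd.log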



--

~ Atin (atinm)
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://lists.gluster.org/mailman/listinfo/gluster-devel
