> > Hi Raghavendra, > > > > The link mentioned seems to be invalid. My bad. Sorry about that. > Just to be clear, we have 3 patches in question here - > > [1] Original patch (merged in master but not in release-3.7) > http://review.gluster.org/13456 > > There are two patches proposed to fix the regression > [2] http://review.gluster.org/#/c/13587/ > [3] http://review.gluster.org/#/c/13592/ > > Since the patch by Kotresh [3] completely fixes the regression, we have > decided to abandon [2]. In addition, since these fixes look very > intricate, till we are sure that the code is stable, we thought it may > be better not to back-port these patches to stable release branch. > > Please confirm if you are implying to revive [2] and back-port all > these 3 patches to release-3.7 branch? [3] is needed for master. [1] and [3] are needed for release-3.7 branch. [2] can stay abandoned. > > Thanks, > Soumya > > > 3. Kotresh will work on new changes to make sure changelog makes > > correct use of rpc-clnt. > > > > [1] http://review.gluster.org/#/c/13592 > > [2] http://review.gluster.org/#/c/1359 > > > > regards, > > Raghavendra. > > > > > > Thanks and Regards, > > Kotresh H R > > > > ----- Original Message ----- > > > From: "Soumya Koduri" <skoduri@xxxxxxxxxx > > > <mailto:skoduri@xxxxxxxxxx>> > > > To: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx > > > <mailto:khiremat@xxxxxxxxxx>>, "Raghavendra > > G" <raghavendra@xxxxxxxxxxx <mailto:raghavendra@xxxxxxxxxxx>> > > > Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx > > > <mailto:gluster-devel@xxxxxxxxxxx>> > > > Sent: Thursday, March 3, 2016 5:06:00 PM > > > Subject: Re: Cores generated with > > ./tests/geo-rep/georep-basic-dr-tarssh.t > > > > > > > > > > > > On 03/03/2016 04:58 PM, Kotresh Hiremath Ravishankar wrote: > > > > [Replying on top of my own reply] > > > > > > > > Hi, > > > > > > > > I have submitted the below patch [1] to avoid the issue of > > > > 'rpc_clnt_submit' > > > > getting reconnected. But it won't take care of memory leak > > problem you were > > > > trying to fix. That we have to carefully go through all > > cases and fix it. > > > > Please have a look at it. > > > > > > > Looks good. IIUC, with this patch, we need not set > > conn->trans to NULL > > > in 'rpc_clnt_disable()'. Right? If yes, then it takes care of > > memleak as > > > the transport object shall then get freed as part of > > > 'rpc_clnt_trigger_destroy'. > > > > > > > > > > http://review.gluster.org/#/c/13592/ > > > > > > > > Thanks and Regards, > > > > Kotresh H R > > > > > > > > ----- Original Message ----- > > > >> From: "Kotresh Hiremath Ravishankar" <khiremat@xxxxxxxxxx > > <mailto:khiremat@xxxxxxxxxx>> > > > >> To: "Soumya Koduri" <skoduri@xxxxxxxxxx > > <mailto:skoduri@xxxxxxxxxx>> > > > >> Cc: "Raghavendra G" <raghavendra@xxxxxxxxxxx > > <mailto:raghavendra@xxxxxxxxxxx>>, "Gluster Devel" > > > >> <gluster-devel@xxxxxxxxxxx > > > >> <mailto:gluster-devel@xxxxxxxxxxx>> > > > >> Sent: Thursday, March 3, 2016 3:39:11 PM > > > >> Subject: Re: Cores generated with > > > >> ./tests/geo-rep/georep-basic-dr-tarssh.t > > > >> > > > >> Hi Soumya, > > > >> > > > >> I tested the lastes patch [2] on master where your > > previous patch [1] in > > > >> merged. > > > >> I see crashes at different places. > > > >> > > > >> 1. If there are code paths that are holding rpc object > > without taking ref > > > >> on > > > >> it, all those > > > >> code path will crash on invoking rpc submit on that > > object as rpc > > > >> object > > > >> would have freed > > > >> by last unref on DISCONNECT event. I see this kind of > > use-case in > > > >> chagnelog rpc code. > > > >> Need to check on other users of rpc. > > > Agree. We should fix all such code-paths. Since this seem to > > be an > > > intricate fix, shall we take these patches only in master > > branch and not > > > in 3.7 release for now till we fix all such paths as we > > encounter? > > > > > > >> > > > >> 2. And also we need to take care of reconnect timers that > > are being set > > > >> and > > > >> are re-tried to > > > >> connect back on expiration. In those cases also, we > > might crash as rpc > > > >> object would have freed. > > > Your patch addresses this..right? > > > > > > Thanks, > > > Soumya > > > > > > >> > > > >> > > > >> [1] http://review.gluster.org/#/c/13507/ > > > >> [2] http://review.gluster.org/#/c/13587/ > > > >> > > > >> Thanks and Regards, > > > >> Kotresh H R > > > >> > > > >> ----- Original Message ----- > > > >>> From: "Soumya Koduri" <skoduri@xxxxxxxxxx > > <mailto:skoduri@xxxxxxxxxx>> > > > >>> To: "Raghavendra G" <raghavendra@xxxxxxxxxxx > > <mailto:raghavendra@xxxxxxxxxxx>>, "Kotresh Hiremath > > > >>> Ravishankar" <khiremat@xxxxxxxxxx > > <mailto:khiremat@xxxxxxxxxx>> > > > >>> Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx > > <mailto:gluster-devel@xxxxxxxxxxx>> > > > >>> Sent: Thursday, March 3, 2016 12:24:00 PM > > > >>> Subject: Re: Cores generated with > > > >>> ./tests/geo-rep/georep-basic-dr-tarssh.t > > > >>> > > > >>> Thanks a lot Kotresh. > > > >>> > > > >>> On 03/03/2016 08:47 AM, Raghavendra G wrote: > > > >>>> Hi Soumya, > > > >>>> > > > >>>> Can you send a fix to this regression on upstream master > > too? This patch > > > >>>> is merged there. > > > >>>> > > > >>> I have submitted below patch. > > > >>> http://review.gluster.org/#/c/13587/ > > > >>> > > > >>> Kindly review the same. > > > >>> > > > >>> Thanks, > > > >>> Soumya > > > >>> > > > >>>> regards, > > > >>>> Raghavendra > > > >>>> > > > >>>> On Tue, Mar 1, 2016 at 10:34 PM, Kotresh Hiremath > > Ravishankar > > > >>>> <khiremat@xxxxxxxxxx <mailto:khiremat@xxxxxxxxxx> > > <mailto:khiremat@xxxxxxxxxx <mailto:khiremat@xxxxxxxxxx>>> wrote: > > > >>>> > > > >>>> Hi Soumya, > > > >>>> > > > >>>> I analysed the issue and found out that crash has > > happened because > > > >>>> of the patch [1]. > > > >>>> > > > >>>> The patch doesn't set transport object to NULL in > > > >>>> 'rpc_clnt_disable' > > > >>>> but instead does it on > > > >>>> 'rpc_clnt_trigger_destroy'. So if there are pending > > rpc invocations > > > >>>> on the rpc object that > > > >>>> is disabled (those instances are possible as > > happening now in > > > >>>> changelog), it will trigger a > > > >>>> CONNECT notify again with 'mydata' that is freed > > causing a crash. > > > >>>> This happens because > > > >>>> 'rpc_clnt_submit' reconnects if rpc is not connected. > > > >>>> > > > >>>> rpc_clnt_submit (...) { > > > >>>> ... > > > >>>> if (conn->connected == 0) { > > > >>>> ret = > > rpc_transport_connect (conn->trans, > > > >>>> > > > >>>> conn->config.remote_port); > > > >>>> } > > > >>>> ... > > > >>>> } > > > >>>> > > > >>>> Without your patch, conn->trans was set NULL and > > hence CONNECT > > > >>>> fails > > > >>>> not resulting with > > > >>>> CONNECT notify call. And also the cleanup happens > > in failure path. > > > >>>> > > > >>>> So the memory leak can happen, if there is no try > > for rpc > > > >>>> invocation > > > >>>> after DISCONNECT. > > > >>>> It will be cleaned up otherwise. > > > >>>> > > > >>>> > > > >>>> [1] http://review.gluster.org/#/c/13507/ > > > >>>> > > > >>>> Thanks and Regards, > > > >>>> Kotresh H R > > > >>>> > > > >>>> ----- Original Message ----- > > > >>>> > From: "Kotresh Hiremath Ravishankar" > > <khiremat@xxxxxxxxxx <mailto:khiremat@xxxxxxxxxx> > > > >>>> <mailto:khiremat@xxxxxxxxxx > > <mailto:khiremat@xxxxxxxxxx>>> > > > >>>> > To: "Soumya Koduri" <skoduri@xxxxxxxxxx > > <mailto:skoduri@xxxxxxxxxx> > > > >>>> > <mailto:skoduri@xxxxxxxxxx > > <mailto:skoduri@xxxxxxxxxx>>> > > > >>>> > Cc: avishwan@xxxxxxxxxx > > <mailto:avishwan@xxxxxxxxxx> <mailto:avishwan@xxxxxxxxxx > > <mailto:avishwan@xxxxxxxxxx>>, "Gluster > > > >>>> Devel" <gluster-devel@xxxxxxxxxxx > > <mailto:gluster-devel@xxxxxxxxxxx> > > > >>>> <mailto:gluster-devel@xxxxxxxxxxx > > <mailto:gluster-devel@xxxxxxxxxxx>>> > > > >>>> > Sent: Monday, February 29, 2016 4:15:22 PM > > > >>>> > Subject: Re: Cores generated with > > > >>>> ./tests/geo-rep/georep-basic-dr-tarssh.t > > > >>>> > > > > >>>> > Hi Soumya, > > > >>>> > > > > >>>> > I just tested that it is reproducible only with > > your patch both > > > >>>> in master and > > > >>>> > 3.76 branch. > > > >>>> > The geo-rep test cases are marked bad in master. > > So it's not hit > > > >>>> in master. > > > >>>> > rpc is introduced > > > >>>> > in changelog xlator to communicate to > > applications via > > > >>>> libgfchangelog. > > > >>>> > Venky/Me will check > > > >>>> > why is the crash happening and will update. > > > >>>> > > > > >>>> > > > > >>>> > Thanks and Regards, > > > >>>> > Kotresh H R > > > >>>> > > > > >>>> > ----- Original Message ----- > > > >>>> > > From: "Soumya Koduri" <skoduri@xxxxxxxxxx > > <mailto:skoduri@xxxxxxxxxx> > > > >>>> <mailto:skoduri@xxxxxxxxxx > > <mailto:skoduri@xxxxxxxxxx>>> > > > >>>> > > To: avishwan@xxxxxxxxxx > > <mailto:avishwan@xxxxxxxxxx> <mailto:avishwan@xxxxxxxxxx > > <mailto:avishwan@xxxxxxxxxx>>, > > > >>>> > > "kotresh" > > > >>>> <khiremat@xxxxxxxxxx <mailto:khiremat@xxxxxxxxxx> > > <mailto:khiremat@xxxxxxxxxx <mailto:khiremat@xxxxxxxxxx>>> > > > >>>> > > Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx > > <mailto:gluster-devel@xxxxxxxxxxx> > > > >>>> <mailto:gluster-devel@xxxxxxxxxxx > > <mailto:gluster-devel@xxxxxxxxxxx>>> > > > >>>> > > Sent: Monday, February 29, 2016 2:10:51 PM > > > >>>> > > Subject: Cores generated with > > > >>>> ./tests/geo-rep/georep-basic-dr-tarssh.t > > > >>>> > > > > > >>>> > > Hi Aravinda/Kotresh, > > > >>>> > > > > > >>>> > > With [1], I consistently see cores generated > > with the test > > > >>>> > > './tests/geo-rep/georep-basic-dr-tarssh.t' in > > release-3.7 > > > >>>> branch. From > > > >>>> > > the cores, looks like we are trying to > > dereference a freed > > > >>>> > > changelog_rpc_clnt_t(crpc) object in > > changelog_rpc_notify(). > > > >>>> Strangely > > > >>>> > > this was not reported in master branch. > > > >>>> > > > > > >>>> > > I tried debugging but couldn't find any > > possible suspects. I > > > >>>> request you > > > >>>> > > to take a look and let me know if [1] caused > > any regression. > > > >>>> > > > > > >>>> > > Thanks, > > > >>>> > > Soumya > > > >>>> > > > > > >>>> > > [1] http://review.gluster.org/#/c/13507/ > > > >>>> > > > > > >>>> > > > > >>>> _______________________________________________ > > > >>>> Gluster-devel mailing list > > > >>>> Gluster-devel@xxxxxxxxxxx > > <mailto:Gluster-devel@xxxxxxxxxxx> > > <mailto:Gluster-devel@xxxxxxxxxxx > > <mailto:Gluster-devel@xxxxxxxxxxx>> > > > >>>> http://www.gluster.org/mailman/listinfo/gluster-devel > > > >>>> > > > >>>> > > > >>>> > > > >>>> > > > >>>> -- > > > >>>> Raghavendra G > > > >>> > > > >> _______________________________________________ > > > >> Gluster-devel mailing list > > > >> Gluster-devel@xxxxxxxxxxx <mailto:Gluster-devel@xxxxxxxxxxx> > > > >> http://www.gluster.org/mailman/listinfo/gluster-devel > > > >> > > > > > _______________________________________________ > > Gluster-devel mailing list > > Gluster-devel@xxxxxxxxxxx <mailto:Gluster-devel@xxxxxxxxxxx> > > http://www.gluster.org/mailman/listinfo/gluster-devel > > > > > > > > > > -- > > Raghavendra G > > > > > > > > > > -- > > Raghavendra G > _______________________________________________ > Gluster-devel mailing list > Gluster-devel@xxxxxxxxxxx > http://www.gluster.org/mailman/listinfo/gluster-devel > _______________________________________________ Gluster-devel mailing list Gluster-devel@xxxxxxxxxxx http://www.gluster.org/mailman/listinfo/gluster-devel