----- Original Message -----
> From: "Atin Mukherjee" <amukherj@xxxxxxxxxx>
> To: gluster-devel@xxxxxxxxxxx, "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> Sent: Wednesday, May 21, 2014 3:39:21 PM
> Subject: Re: Fwd: Re: Spurious failures because of nfs and snapshots
>
>
> On 05/21/2014 11:42 AM, Atin Mukherjee wrote:
> >
> > On 05/21/2014 10:54 AM, SATHEESARAN wrote:
> >> Guys,
> >>
> >> This is the issue pointed out by Pranith with regard to Barrier.
> >> I was reading through it.
> >>
> >> But I wanted to bring it to your attention.
> >>
> >> -- S
> >>
> >>
> >> -------- Original Message --------
> >> Subject: Re: Spurious failures because of nfs and snapshots
> >> Date: Tue, 20 May 2014 21:16:57 -0400 (EDT)
> >> From: Pranith Kumar Karampuri <pkarampu@xxxxxxxxxx>
> >> To: Vijaikumar M <vmallika@xxxxxxxxxx>, Joseph Fernandes <josferna@xxxxxxxxxx>
> >> CC: Gluster Devel <gluster-devel@xxxxxxxxxxx>
> >>
> >>
> >> Hey,
> >> Seems like even after this fix is merged, the regression tests are
> >> failing for the same script. You can check the logs at
> >> http://build.gluster.org:443/logs/glusterfs-logs-20140520%3a14%3a06%3a46.tgz
>
> > Pranith,
> >
> > Is this the correct link? I don't see any log having this sequence there.
> > Also, looking at the log from this mail, this is expected as per the
> > barrier functionality: an enable request followed by another enable
> > should always fail, and the same happens for disable.
> >
> > Can you please confirm the link and which particular regression test is
> > causing this issue? Is it bug-1090042.t?
> >
> > --Atin
> >>
> >> Relevant logs:
> >> [2014-05-20 20:17:07.026045] : volume create patchy
> >> build.gluster.org:/d/backends/patchy1
> >> build.gluster.org:/d/backends/patchy2 : SUCCESS
> >> [2014-05-20 20:17:08.030673] : volume start patchy : SUCCESS
> >> [2014-05-20 20:17:08.279148] : volume barrier patchy enable : SUCCESS
> >> [2014-05-20 20:17:08.476785] : volume barrier patchy enable : FAILED :
> >> Failed to reconfigure barrier.
> >> [2014-05-20 20:17:08.727429] : volume barrier patchy disable : SUCCESS
> >> [2014-05-20 20:17:08.926995] : volume barrier patchy disable : FAILED :
> >> Failed to reconfigure barrier.
> >>
> This log is for bug-1092841.t, and it's expected.

Damn :-(. I think I screwed up the timestamps while checking... Sorry
about that :-(. But there are failures. Check
http://build.gluster.org/job/regression/4501/consoleFull

Pranith

>
> --Atin
> >> Pranith
> >>
> >> ----- Original Message -----
> >>> From: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> >>> To: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> >>> Cc: "Joseph Fernandes" <josferna@xxxxxxxxxx>, "Vijaikumar M" <vmallika@xxxxxxxxxx>
> >>> Sent: Tuesday, May 20, 2014 3:41:11 PM
> >>> Subject: Re: Spurious failures because of nfs and snapshots
> >>>
> >>> hi,
> >>> Please resubmit the patches on top of
> >>> http://review.gluster.com/#/c/7753
> >>> to prevent frequent regression failures.
> >>>
> >>> Pranith
> >>> ----- Original Message -----
> >>>> From: "Vijaikumar M" <vmallika@xxxxxxxxxx>
> >>>> To: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> >>>> Cc: "Joseph Fernandes" <josferna@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> >>>> Sent: Monday, May 19, 2014 2:40:47 PM
> >>>> Subject: Re: Spurious failures because of nfs and snapshots
> >>>>
> >>>> Brick disconnected with ping-timeout:
> >>>>
> >>>> Here is the log message:
> >>>> [2014-05-19 04:29:38.133266] I [MSGID: 100030] [glusterfsd.c:1998:main]
> >>>> 0-/build/install/sbin/glusterfsd: Started running
> >>>> /build/install/sbin/glusterfsd version 3.5qa2 (args:
> >>>> /build/install/sbin/glusterfsd -s build.gluster.org --volfile-id
> >>>> /snaps/patchy_snap1/3f2ae3fbb4a74587b1a91013f07d327f.build.gluster.org.var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3
> >>>> -p /var/lib/glusterd/snaps/patchy_snap1/3f2ae3fbb4a74587b1a91013f07d327f/run/build.gluster.org-var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3.pid
> >>>> -S /var/run/51fe50a6faf0aae006c815da946caf3a.socket --brick-name
> >>>> /var/run/gluster/snaps/3f2ae3fbb4a74587b1a91013f07d327f/brick3 -l
> >>>> /build/install/var/log/glusterfs/bricks/var-run-gluster-snaps-3f2ae3fbb4a74587b1a91013f07d327f-brick3.log
> >>>> --xlator-option *-posix.glusterd-uuid=494ef3cd-15fc-4c8c-8751-2d441ba7b4b0
> >>>> --brick-port 49164 --xlator-option
> >>>> 3f2ae3fbb4a74587b1a91013f07d327f-server.listen-port=49164)
> >>>> [2014-05-19 04:29:38.141118] I [rpc-clnt.c:988:rpc_clnt_connection_init]
> >>>> 0-glusterfs: defaulting ping-timeout to 30secs
> >>>> [2014-05-19 04:30:09.139521] C [rpc-clnt-ping.c:105:rpc_clnt_ping_timer_expired]
> >>>> 0-glusterfs: server 10.3.129.13:24007 has not responded in the last
> >>>> 30 seconds, disconnecting.
> >>>>
> >>>>
> >>>> Patch http://review.gluster.org/#/c/7753/ will fix the problem: the
> >>>> ping timer will be disabled by default for all RPC connections except
> >>>> glusterd-glusterd (set to 30 sec) and client-glusterd (set to 42 sec).
> >>>>
> >>>>
> >>>> Thanks,
> >>>> Vijay
> >>>>
> >>>>
> >>>> On Monday 19 May 2014 11:56 AM, Pranith Kumar Karampuri wrote:
> >>>>> The latest build failure also has the same issue.
> >>>>> Download it from here:
> >>>>> http://build.gluster.org:443/logs/glusterfs-logs-20140518%3a22%3a27%3a31.tgz
> >>>>>
> >>>>> Pranith
> >>>>>
> >>>>> ----- Original Message -----
> >>>>>> From: "Vijaikumar M" <vmallika@xxxxxxxxxx>
> >>>>>> To: "Joseph Fernandes" <josferna@xxxxxxxxxx>
> >>>>>> Cc: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>, "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> >>>>>> Sent: Monday, 19 May, 2014 11:41:28 AM
> >>>>>> Subject: Re: Spurious failures because of nfs and snapshots
> >>>>>>
> >>>>>> Hi Joseph,
> >>>>>>
> >>>>>> In the log mentioned below, it says ping-timeout is set to the
> >>>>>> default value of 30 sec. I think the issue is different.
> >>>>>> Can you please point me to the logs where you were able to
> >>>>>> re-create the problem?
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Vijay
> >>>>>>
> >>>>>>
> >>>>>> On Monday 19 May 2014 09:39 AM, Pranith Kumar Karampuri wrote:
> >>>>>>> hi Vijai, Joseph,
> >>>>>>> In 2 of the last 3 build failures,
> >>>>>>> http://build.gluster.org/job/regression/4479/console and
> >>>>>>> http://build.gluster.org/job/regression/4478/console, this
> >>>>>>> test (tests/bugs/bug-1090042.t) failed. Do you guys think it is
> >>>>>>> better to revert this test until the fix is available? Please
> >>>>>>> send a patch to revert the test case if you guys feel so. You
> >>>>>>> can re-submit it along with the fix to the bug mentioned by
> >>>>>>> Joseph.
> >>>>>>>
> >>>>>>> Pranith.
> >>>>>>>
> >>>>>>> ----- Original Message -----
> >>>>>>>> From: "Joseph Fernandes" <josferna@xxxxxxxxxx>
> >>>>>>>> To: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> >>>>>>>> Cc: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> >>>>>>>> Sent: Friday, 16 May, 2014 5:13:57 PM
> >>>>>>>> Subject: Re: Spurious failures because of nfs and snapshots
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Hi All,
> >>>>>>>>
> >>>>>>>> tests/bugs/bug-1090042.t:
> >>>>>>>>
> >>>>>>>> I was able to reproduce the issue, i.e. when this test is run
> >>>>>>>> in a loop:
> >>>>>>>>
> >>>>>>>> for i in {1..135}; do ./bugs/bug-1090042.t; done
> >>>>>>>>
> >>>>>>>> When I checked the logs:
> >>>>>>>> [2014-05-16 10:49:49.003978] I [rpc-clnt.c:973:rpc_clnt_connection_init]
> >>>>>>>> 0-management: setting frame-timeout to 600
> >>>>>>>> [2014-05-16 10:49:49.004035] I [rpc-clnt.c:988:rpc_clnt_connection_init]
> >>>>>>>> 0-management: defaulting ping-timeout to 30secs
> >>>>>>>> [2014-05-16 10:49:49.004303] I [rpc-clnt.c:973:rpc_clnt_connection_init]
> >>>>>>>> 0-management: setting frame-timeout to 600
> >>>>>>>> [2014-05-16 10:49:49.004340] I [rpc-clnt.c:988:rpc_clnt_connection_init]
> >>>>>>>> 0-management: defaulting ping-timeout to 30secs
> >>>>>>>>
> >>>>>>>> The issue is with ping-timeout and is tracked under the bug
> >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1096729
> >>>>>>>>
> >>>>>>>> The workaround is mentioned in
> >>>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=1096729#c8
> >>>>>>>>
> >>>>>>>> Regards,
> >>>>>>>> Joe
> >>>>>>>>
> >>>>>>>> ----- Original Message -----
> >>>>>>>> From: "Pranith Kumar Karampuri" <pkarampu@xxxxxxxxxx>
> >>>>>>>> To: "Gluster Devel" <gluster-devel@xxxxxxxxxxx>
> >>>>>>>> Cc: "Joseph Fernandes" <josferna@xxxxxxxxxx>
> >>>>>>>> Sent: Friday, May 16, 2014 6:19:54 AM
> >>>>>>>> Subject: Spurious failures because of nfs and snapshots
> >>>>>>>>
> >>>>>>>> hi,
> >>>>>>>> The latest build I fired for review.gluster.com/7766
> >>>>>>>> (http://build.gluster.org/job/regression/4443/console) failed
> >>>>>>>> because of a spurious failure. The script doesn't wait for the
> >>>>>>>> NFS export to be available. I fixed that, but interestingly I
> >>>>>>>> found quite a few scripts with the same problem. Some of the
> >>>>>>>> scripts rely on 'sleep 5', which could also lead to spurious
> >>>>>>>> failures if the export is not available within 5 seconds. We
> >>>>>>>> found that waiting for 20 seconds is better, but 'sleep 20'
> >>>>>>>> would unnecessarily delay the build execution. So if you guys
> >>>>>>>> are going to write any scripts that have to do NFS mounts,
> >>>>>>>> please do it the following way:
> >>>>>>>>
> >>>>>>>> EXPECT_WITHIN 20 "1" is_nfs_export_available;
> >>>>>>>> TEST mount -t nfs -o vers=3 $H0:/$V0 $N0;
> >>>>>>>>
> >>>>>>>> Please review http://review.gluster.com/7773 :-)
> >>>>>>>>
> >>>>>>>> I saw one more spurious failure in a snapshot-related script,
> >>>>>>>> tests/bugs/bug-1090042.t, on the next build fired by Niels.
> >>>>>>>> Joseph (CCed) is debugging it. He agreed to reply with what he
> >>>>>>>> finds and share it with us so that we won't introduce similar
> >>>>>>>> bugs in future.
> >>>>>>>>
> >>>>>>>> I encourage you guys to share what you fix to prevent spurious
> >>>>>>>> failures in future.
> >>>>>>>>
> >>>>>>>> Thanks
> >>>>>>>> Pranith
> >>>>>>>>
> >>>>>>
> >>>>
> >>>>
> >>>
> >> _______________________________________________
> >> Gluster-devel mailing list
> >> Gluster-devel@xxxxxxxxxxx
> >> http://supercolony.gluster.org/mailman/listinfo/gluster-devel
> >>
> >>
> >>
_______________________________________________
Gluster-devel mailing list
Gluster-devel@xxxxxxxxxxx
http://supercolony.gluster.org/mailman/listinfo/gluster-devel
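[Editor's note] The `EXPECT_WITHIN 20 "1" is_nfs_export_available` idiom recommended in the thread is a poll-until-deadline loop. The sketch below is a simplified stand-in, not the actual helper from the glusterfs test harness, and the `is_ready` check is hypothetical (it simulates `is_nfs_export_available` with a marker file):

```shell
#!/bin/bash
# Sketch of an EXPECT_WITHIN-style helper: re-run a check command every
# second until its output equals the expected value or the timeout expires.
expect_within () {
    local timeout=$1 expected=$2 check=$3
    local elapsed=0
    while [ "$elapsed" -lt "$timeout" ]; do
        # Succeed as soon as the check's output matches the expectation.
        if [ "$($check)" = "$expected" ]; then
            return 0
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 1  # timed out without ever matching
}

# Hypothetical check simulating is_nfs_export_available: prints "1" once
# a marker file exists, "0" otherwise.
is_ready () {
    if [ -e /tmp/export_ready ]; then echo 1; else echo 0; fi
}

touch /tmp/export_ready
if expect_within 20 "1" is_ready; then
    echo "export available"    # prints immediately; the marker already exists
fi
rm -f /tmp/export_ready
```

This shape is why the thread prefers it over a fixed `sleep 20`: the common case returns as soon as the export shows up, and only a genuinely stuck export pays the full 20-second deadline before the test fails.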