Re: Weird (seemingly flakey) p4 breakage in t9833

Hi Gábor,

On Wed, 6 Feb 2019, SZEDER Gábor wrote:

> On Wed, Feb 06, 2019 at 11:33:21AM +0100, Johannes Schindelin wrote:
> 
> > in a private Azure Pipeline (sorry...) I noticed an intermittent problem
> > in the p4 tests on osx-gcc.
> > 
> > I would point you to a public log, but the Azure Pipelines support *just*
> > made it to `next`, so I *just* set up a public pipeline targeting anything
> > other than my `vsts-ci` branch, at
> > https://dev.azure.com/gitgitgadget/git/_build/index?definitionId=6. And
> > those builds do not show that problem, so it must be a flaky test.
> > 
> > But maybe you can spot anything familiar from the log?
> > 
> > -- snip --
> > [...]
> > ++ P4TICKETS='/Users/vsts/agent/2.146.0/work/1/s/t/trash
> > directory.t9833-errors/cli/tickets'
> > ++ P4USER=short_expiry_user
> > ++ echo password
> > ++ p4 login
> > Enter password: 
> > User short_expiry_user logged in.
> > Perforce db files in '.' will be created if missing...
> > ++ cd '/Users/vsts/agent/2.146.0/work/1/s/t/trash
> > directory.t9833-errors/git'
> > ++ git p4 sync
> > ++ true
> > +++ time_in_seconds
> > +++ cd /
> > +++ /usr/bin/python -c 'import time; print(int(time.time()))'
> > ++ test 1549411312 -gt 1549411605
> > ++ sleep 1
> > Perforce db files in '.' will be created if missing...
> > failure accessing depot: perforce ticket expires in 1 seconds
> > Performing incremental import into refs/remotes/p4/master git branch
> > Depot paths: //depot/
> > error: last command exited with $?=1
> > ++ true
> > +++ time_in_seconds
> > +++ cd /
> > +++ /usr/bin/python -c 'import time; print(int(time.time()))'
> > ++ test 1549411314 -gt 1549411605
> > ++ sleep 1
> > not ok 6 - git operation with expired ticket
> > #	
> > #		P4TICKETS="$cli/tickets" &&
> > #		P4USER=short_expiry_user &&
> > #		echo "password" | p4 login &&
> > #		(
> > #			cd "$git" &&
> > #			git p4 sync &&
> > #			sleep 5 &&
> > #			test_must_fail git p4 sync 2>errmsg &&
> > #			grep "failure accessing depot" errmsg
> > #		)
> > #	
> > -- snap --
> 
> I saw this on Travis CI a couple of times, and looked into it, though
> I have no idea how p4 is supposed to work...  anyway, my theory is:
> 
> The previous test 'create group with short ticket expiry' creates a
> "ticket" with 3 seconds expiration time, to be used in this failing
> one.  This timeout might be just a bit too short when the test runs
> under high load: it can take long enough to reach the first 'git p4
> sync' that the timeout is (almost?) up by then, and 'git p4' errors
> out.

That sounds very plausible to me, thank you for the analysis!
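
For onlookers, the suspected failure mode boils down to something like
this toy script (purely illustrative; the numbers are made up and none
of this is the actual t9833 code, which uses a 3-second expiry):

```shell
# Toy illustration of the suspected race: a ticket valid for only a few
# seconds can already be expired by the time the first 'git p4 sync'
# gets to run on a loaded machine.
ticket_lifetime=1                 # t9833 uses a 3-second expiry
issued=$(date +%s)                # ticket handed out here

sleep 2                           # stand-in for slow test setup under load

now=$(date +%s)
if test $((now - issued)) -ge "$ticket_lifetime"
then
	echo "ticket expired before the first sync"
fi
```

The fixed-length 'sleep' is exactly what makes the real test fragile:
the margin between setup time and ticket lifetime shrinks under load.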

> I'm not sure what the proper solution would be (assuming that my
> theory is right, that is): increasing the ticket timeout to a value
> that is surely long enough would require a longer 'sleep' in this
> test, which is not good.  I wonder why that failing 'git p4 sync' is
> necessary in the first place, and whether it's really necessary to
> test ticket expiration at all, but then again: I have no idea how p4
> works.
> 
> On a related note, I think it would be better if these two tests were
> squashed into one, so we would get the whole picture.

Agreed.

> > BTW I find it very odd to see a `sleep 1` in the trace but not in the
> > snippet (there is only a `sleep 5` instead, which I fail to see in the
> > trace). Odd?
> 
> Intentional.  p4d seems to be prone to hanging; to work around this,
> start_p4d() in 'lib-git-p4.sh' starts a watchdog process in the
> background that kills it once a long-enough timeout is up.

What I found odd was that I did not see that `sleep 1` at all in the
trace, but now it makes sense.

> These three lines in the log:
> 
> > +++ /usr/bin/python -c 'import time; print(int(time.time()))'
> > ++ test 1549411312 -gt 1549411605
> > ++ sleep 1
> 
> come from that background process.
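
Ah, so the watchdog boils down to something like this (a paraphrase,
not the exact code in 'lib-git-p4.sh'; the function names mirror the
trace, but the 300-second timeout and the argument handling are
illustrative):

```shell
# Print the current Unix time. The test suite shells out to Python
# because 'date +%s' is not POSIX; the trace used /usr/bin/python,
# python3 here so the sketch runs on modern systems.
time_in_seconds () {
	(cd / && python3 -c 'import time; print(int(time.time()))')
}

# Watch a p4d process and kill it if it outlives the timeout.
watchdog () {
	p4d_pid=$1
	timeout=$(($(time_in_seconds) + 300))
	while kill -0 "$p4d_pid" 2>/dev/null
	do
		if test "$(time_in_seconds)" -gt "$timeout"
		then
			kill -9 "$p4d_pid"   # p4d hung; put it down
			break
		fi
		sleep 1   # these are the 'sleep 1' lines in the trace
	done
}

# start_p4d would run this in the background: watchdog "$pid" &
```

That also explains why the 'test ... -gt ...' and 'sleep 1' lines keep
interleaving with the test's own trace output.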
> 
> A couple of cleanup patches on top of your 'test_atexit' will
> eventually get rid of it...  once I get around to cleaning them up :)

Heh, so there *will* be something useful coming out of that aborted side
track? Great!

Thank you,
Dscho
