Re: t7006 sometimes hangs in cronjobs on OS X

On Wed, Feb 09, 2011 at 03:38:46PM +0100, Thomas Rast wrote:

> I have been running a nightly next&pu smoke tester each on RHEL5 and
> OS X.  For quite some time (at least a month), t7006 gets stuck
> randomly (i.e., not every night).  [This has been a bit of a problem
> because it keeps a lot of processes stuck on the machine and eats into
> my ulimit, but that's the case with any stuck process; I'll have to
> think about a good solution to that.]
> 
> The relevant part of 'ps xwww' is
> 
>   65211   ??  SN     0:00.03 /bin/sh t7006-pager.sh
>   65798   ??  SN     0:00.04 /usr/bin/perl /Users/trast/git-smoke/t/test-terminal.perl git --no-pager log
>   65846   ??  ZN     0:00.00 (git)
>   65847   ??  ZN     0:00.00 (perl5.10.0)

Hmm. The zombie git process implies to me that git has exited, but for
some reason we are still stuck in the copy-to-terminal loop and haven't
reaped it. But the zombie perl process confuses me.  We fork a second
time so that one process copies stderr and the other one copies stdout.
Is the second perl process the stderr copier, and we are still blocking
on copying stdout for some reason? But then why is the command name
different? Is /usr/bin/perl a wrapper script on your platform?

Have you tried running strace on the surviving perl process to see what
it's doing? Presumably it's just hung on a read() syscall.

Or maybe try stracing the whole thing from the start, which may be more
informative, like:

diff --git a/t/lib-terminal.sh b/t/lib-terminal.sh
index c383b57..f7e6b7f 100644
--- a/t/lib-terminal.sh
+++ b/t/lib-terminal.sh
@@ -13,7 +13,7 @@ test_expect_success 'set up terminal for tests' '
 				echo >&4 "test_terminal: need to declare TTY prerequisite"
 				return 127
 			fi
-			"$PERL_PATH" "$TEST_DIRECTORY"/test-terminal.perl "$@"
+			strace -f "$PERL_PATH" "$TEST_DIRECTORY"/test-terminal.perl "$@"
 		}
 	fi
 '

and giving us the stderr dump (or perhaps redirecting strace output to a
file and saving it)?
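
A minimal sketch of the redirect-to-file variant, in case it helps. The
helper name trace_to_file and the trace path are hypothetical, not part
of lib-terminal.sh. It assumes strace's -f (follow forked children) and
-o (write the trace to a file) options, and falls back to running the
command untraced where strace is not installed, as is the case on OS X,
where a different tool such as dtruss would be needed.

```shell
# Hypothetical helper (not in git's test suite): run a command under
# strace when it is available, writing the trace to a file so it does
# not get mixed into the test's own stderr.
trace_to_file () {
	trace_file=$1
	shift
	if command -v strace >/dev/null 2>&1
	then
		# -f follows forked children; -o sends the trace to a file
		strace -f -o "$trace_file" "$@"
	else
		# no strace on this platform; run the command untraced
		"$@"
	fi
}
```

Inside test_terminal this could be called as
trace_to_file /tmp/t7006.trace "$PERL_PATH" "$TEST_DIRECTORY"/test-terminal.perl "$@"
so that the trace of a hung run survives for later inspection.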

>     TEST_JOBS=6 nice make smoke

If you use TEST_JOBS=1, does it still happen? I can't imagine what race
condition would cause this, but other tests also use lib-terminal.sh,
so perhaps there is some interference there.

-Peff