Hi Damien,

On Feb 23 10:28, Corinna Vinschen wrote:
> On Feb 22 07:59, Damien Miller wrote:
> > On Sat, 21 Feb 2015, Corinna Vinschen wrote:
> > > - The failing last loop in the "forwarding" script as reported back
> > >   during 6.7 testing is still failing for me more often than not.  It's
> > >   always the same reason, the script tries to use in-use port numbers.
> > >   Reducing the forwarding script to only this last test loop succeeds
> > >   every time, but is quite a hack for testing.
> >
> > Is it colliding with itself or with other services running on your
> > test host? (especially with ones windows starts itself)
>
> It's colliding with itself, afaics.  It tries to use ports 3301/3302
> again after it already used it in a former test in the script, and with
> a high probability they are still taken.  The same ports are used for
> earlier tests in the same script.  On the testmachine, only very few
> ports are taken by Windows processes and only one of them (RDP, port
> 3389) in the range used by the tests.
>
> Note that this is still the same as described in
> http://lists.mindrot.org/pipermail/openssh-unix-dev/2014-August/032842.html
>
> See also
> http://lists.mindrot.org/pipermail/openssh-unix-dev/2014-August/032854.html
>
> And this workaround still lets the test succeed:
> http://lists.mindrot.org/pipermail/openssh-unix-dev/2014-August/032862.html

I think I'm a step closer to a solution.  I just added debug output to
the forwarding.sh script and it turns out that the test prior to the
"transfer over chained unix domain socket forwards ..."
test, namely

  echo "LocalForward ${base}01 127.0.0.1:$PORT" >> $OBJ/ssh_config
  echo "RemoteForward ${base}02 127.0.0.1:${base}01" >> $OBJ/ssh_config
  for p in 1 2; do
          trace "config file: start forwarding, fork to background"
          ${SSH} -$p -F $OBJ/ssh_config -f somehost sleep 10

          trace "config file: transfer over forwarded channels and check result"
          ${SSH} -F $OBJ/ssh_config -p${base}02 -o 'ConnectionAttempts=4' \
                  somehost cat ${DATA} > ${COPY}
          test -s ${COPY}         || fail "failed copy of ${DATA}"
          cmp ${DATA} ${COPY}     || fail "corrupted copy of ${DATA}"

          wait
  done

leaves the ports 3301/3302 in TIME_WAIT state (as is 4242 from some
earlier test).  Here are the relevant excerpts from ps -e and (Windows)
netstat output.  The first group is the output prior to the above test:

      PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
     1320    3632    1320       3964  ?        1049577 11:19:26 /home/corinna/sshbuild/bin/sshd
     3512    3632    3512       3444  ?        1049577 11:19:27 /home/corinna/sshbuild/bin/sshd
     3632       1    3632       3632  ?        1049577 11:17:49 /home/corinna/sshbuild/bin/sshd

  Active Connections

    Proto  Local Address          Foreign Address        State           PID
    TCP    127.0.0.1:4242         0.0.0.0:0              LISTENING       3632
    TCP    127.0.0.1:61665        127.0.0.1:4242         TIME_WAIT       0
    TCP    127.0.0.1:61666        127.0.0.1:61667        TIME_WAIT       0
    TCP    127.0.0.1:61668        127.0.0.1:61669        TIME_WAIT       0
    TCP    127.0.0.1:61673        127.0.0.1:4242         TIME_WAIT       0
    TCP    127.0.0.1:61674        127.0.0.1:61675        TIME_WAIT       0
    TCP    127.0.0.1:61676        127.0.0.1:61677        TIME_WAIT       0
    TCP    127.0.0.1:61679        127.0.0.1:4242         TIME_WAIT       0
    TCP    127.0.0.1:61680        127.0.0.1:61681        TIME_WAIT       0
    TCP    127.0.0.1:61682        127.0.0.1:61683        TIME_WAIT       0
    TCP    127.0.0.1:61686        127.0.0.1:4242         TIME_WAIT       0
    TCP    127.0.0.1:61687        127.0.0.1:61688        TIME_WAIT       0
    TCP    127.0.0.1:61689        127.0.0.1:61690        TIME_WAIT       0
    TCP    127.0.0.1:61691        127.0.0.1:4242         TIME_WAIT       0
    TCP    127.0.0.1:61692        127.0.0.1:61693        TIME_WAIT       0
    TCP    127.0.0.1:61694        127.0.0.1:61695        TIME_WAIT       0

the second group is the output from right between the above test and the
"transfer over chained unix domain socket forwards..."
      PID    PPID    PGID     WINPID   TTY         UID    STIME COMMAND
     3112     376     376       3212  ?        1049577 11:19:29 /usr/bin/sleep
     3520     388     388       3716  ?        1049577 11:19:32 /usr/bin/sleep
     3056    3632    3056       2100  ?        1049577 11:19:31 /home/corinna/sshbuild/bin/sshd
     4048       1    4048       4048  ?        1049577 11:19:31 /home/corinna/sshbuild/bin/ssh
     2372    3632    2372       4024  ?        1049577 11:19:33 /home/corinna/sshbuild/bin/sshd
     2728       1    2728       2728  ?        1049577 11:19:28 /home/corinna/sshbuild/bin/ssh
     2908    3632    2908        328  ?        1049577 11:19:28 /home/corinna/sshbuild/bin/sshd
     3632       1    3632       3632  ?        1049577 11:17:49 /home/corinna/sshbuild/bin/sshd

  Active Connections

    Proto  Local Address          Foreign Address        State           PID
    TCP    127.0.0.1:3301         0.0.0.0:0              LISTENING       2640
    TCP    127.0.0.1:3301         127.0.0.1:61714        CLOSE_WAIT      2640
    TCP    127.0.0.1:3302         0.0.0.0:0              LISTENING       328
    TCP    127.0.0.1:3302         127.0.0.1:61713        CLOSE_WAIT      328
    TCP    127.0.0.1:4242         0.0.0.0:0              LISTENING       3632
    TCP    127.0.0.1:4242         127.0.0.1:61696        ESTABLISHED     3632
    TCP    127.0.0.1:4242         127.0.0.1:61708        ESTABLISHED     3632
    TCP    127.0.0.1:4242         127.0.0.1:61715        CLOSE_WAIT      3632
    TCP    127.0.0.1:61673        127.0.0.1:4242         TIME_WAIT       0
    TCP    127.0.0.1:61679        127.0.0.1:4242         TIME_WAIT       0
    TCP    127.0.0.1:61680        127.0.0.1:61681        TIME_WAIT       0
    TCP    127.0.0.1:61682        127.0.0.1:61683        TIME_WAIT       0
    TCP    127.0.0.1:61686        127.0.0.1:4242         TIME_WAIT       0
    TCP    127.0.0.1:61687        127.0.0.1:61688        TIME_WAIT       0
    TCP    127.0.0.1:61689        127.0.0.1:61690        TIME_WAIT       0
    TCP    127.0.0.1:61691        127.0.0.1:4242         TIME_WAIT       0
    TCP    127.0.0.1:61692        127.0.0.1:61693        TIME_WAIT       0
    TCP    127.0.0.1:61694        127.0.0.1:61695        TIME_WAIT       0
    TCP    127.0.0.1:61696        127.0.0.1:4242         ESTABLISHED     2640
    TCP    127.0.0.1:61697        127.0.0.1:61698        TIME_WAIT       0
    TCP    127.0.0.1:61699        127.0.0.1:61700        TIME_WAIT       0
    TCP    127.0.0.1:61701        127.0.0.1:3302         TIME_WAIT       0
    TCP    127.0.0.1:61702        127.0.0.1:3301         TIME_WAIT       0
    TCP    127.0.0.1:61703        127.0.0.1:4242         TIME_WAIT       0
    TCP    127.0.0.1:61704        127.0.0.1:61705        TIME_WAIT       0
    TCP    127.0.0.1:61706        127.0.0.1:61707        TIME_WAIT       0
    TCP    127.0.0.1:61708        127.0.0.1:4242         ESTABLISHED     2988
    TCP    127.0.0.1:61709        127.0.0.1:61710        TIME_WAIT       0
    TCP    127.0.0.1:61711        127.0.0.1:61712        TIME_WAIT       0
    TCP    127.0.0.1:61713        127.0.0.1:3302         FIN_WAIT_2      3380
    TCP    127.0.0.1:61714        127.0.0.1:3301         FIN_WAIT_2      2728
    TCP    127.0.0.1:61715        127.0.0.1:4242         FIN_WAIT_2      328
    TCP    127.0.0.1:61716        127.0.0.1:61717        TIME_WAIT       0
    TCP    127.0.0.1:61718        127.0.0.1:61719        TIME_WAIT       0

This may well be a problem local to Windows.  Btw., the large number of
AF_INET sockets is a result of the way Cygwin implements AF_LOCAL
sockets:  They are emulated by local AF_INET sockets since Windows
doesn't know the concept of AF_LOCAL sockets.

Note that there are still sleep processes running.  So on a hunch I just
added a `sleep 30' between the two tests and, lo and behold, the
forwarding.sh test completes successfully every time:

diff --git a/regress/forwarding.sh b/regress/forwarding.sh
index f799d49..1489407 100644
--- a/regress/forwarding.sh
+++ b/regress/forwarding.sh
@@ -120,6 +120,8 @@ for p in 1 2; do
 	wait
 done
 
+sleep 30
+
 for p in 2; do
 	trace "transfer over chained unix domain socket forwards and check result"
 	rm -f $OBJ/unix-[123].fwd


Thanks,
Corinna

-- 
Corinna Vinschen
Cygwin Maintainer
Red Hat
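
A fixed `sleep 30' could in principle be replaced by polling until the
ports are no longer held.  The following is only a rough sketch of that
idea, not something taken from regress/forwarding.sh: the helper names
port_is_local and wait_port_free are hypothetical, and the parsing
assumes the Windows netstat column layout shown above (protocol in
column 1, local address in column 2).

```shell
#!/bin/sh
# Hypothetical helpers, not part of the OpenSSH regress suite.

# port_is_local PORT
# Reads netstat-style rows on stdin and exits 0 if any TCP row has PORT
# as the port of its local address (column 2), otherwise exits 1.
# Assumes the Windows netstat layout: Proto, Local, Foreign, State, PID.
port_is_local() {
    awk -v port="$1" '
        $1 == "TCP" && $2 ~ (":" port "$") { found = 1 }
        END { exit !found }
    '
}

# wait_port_free PORT [TRIES]
# Polls "netstat -an" once per second until PORT no longer shows up as
# a local port, giving up after TRIES attempts (default 30).
wait_port_free() {
    port=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        if ! netstat -an | port_is_local "$port"; then
            return 0
        fi
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

Under these assumptions, something like "wait_port_free ${base}01" and
"wait_port_free ${base}02" between the two loops would wait only as long
as needed.  Rows where the port appears only as the foreign address (the
TIME_WAIT entries above) are deliberately not counted.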