The patch below does not apply to the 4.14-stable tree. If someone wants it applied there, or to any other stable or longterm tree, then please email the backport, including the original git commit id to <stable@xxxxxxxxxxxxxxx>. To reproduce the conflict and resubmit, you may use the following commands: git fetch https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/ linux-4.14.y git checkout FETCH_HEAD git cherry-pick -x 4e7d2a8f0b52abf23b1dc13b3d88bc0923383cd5 # <resolve conflicts, build, test, etc.> git commit -s git send-email --to '<stable@xxxxxxxxxxxxxxx>' --in-reply-to '167818283215417@xxxxxxxxx' --subject-prefix 'PATCH 4.14.y' HEAD^.. Possible dependencies: 4e7d2a8f0b52 ("ktest.pl: Add RUN_TIMEOUT option with default unlimited") 3e1d3678844b ("ktest: Add CONNECT_TIMEOUT to change the connection timeout time") thanks, greg k-h ------------------ original commit in Linus's tree ------------------ >From 4e7d2a8f0b52abf23b1dc13b3d88bc0923383cd5 Mon Sep 17 00:00:00 2001 From: Steven Rostedt <rostedt@xxxxxxxxxxx> Date: Wed, 18 Jan 2023 16:37:25 -0500 Subject: [PATCH] ktest.pl: Add RUN_TIMEOUT option with default unlimited There is a disconnect between the run_command function and the wait_for_input. The wait_for_input has a default timeout of 2 minutes. But if that happens, the run_command loop will exit out to the waitpid() of the executing command. This fails in that it no longer monitors the command, and also, the ssh to the test box can hang when its finished, as it's waiting for the pipe it's writing to to flush, but the loop that reads that pipe has already exited, leaving the command stuck, and the test hangs. Instead, make the default "wait_for_input" of the run_command infinite, and allow the user to override it if they want with a default timeout option "RUN_TIMEOUT". But this fixes the hang that happens when the pipe is full and the ssh session never exits. Cc: stable@xxxxxxxxxxxxxxx Fixes: 6e98d1b4415fe ("ktest: Add timeout to ssh command") Signed-off-by: Steven Rostedt <rostedt@xxxxxxxxxxx> diff --git a/tools/testing/ktest/ktest.pl b/tools/testing/ktest/ktest.pl index 74801811372f..822794ca4029 100755 --- a/tools/testing/ktest/ktest.pl +++ b/tools/testing/ktest/ktest.pl @@ -178,6 +178,7 @@ my $store_failures; my $store_successes; my $test_name; my $timeout; +my $run_timeout; my $connect_timeout; my $config_bisect_exec; my $booted_timeout; @@ -340,6 +341,7 @@ my %option_map = ( "STORE_SUCCESSES" => \$store_successes, "TEST_NAME" => \$test_name, "TIMEOUT" => \$timeout, + "RUN_TIMEOUT" => \$run_timeout, "CONNECT_TIMEOUT" => \$connect_timeout, "CONFIG_BISECT_EXEC" => \$config_bisect_exec, "BOOTED_TIMEOUT" => \$booted_timeout, @@ -1858,6 +1860,14 @@ sub run_command { $command =~ s/\$SSH_USER/$ssh_user/g; $command =~ s/\$MACHINE/$machine/g; + if (!defined($timeout)) { + $timeout = $run_timeout; + } + + if (!defined($timeout)) { + $timeout = -1; # tell wait_for_input to wait indefinitely + } + doprint("$command ... "); $start_time = time; @@ -1884,13 +1894,10 @@ sub run_command { while (1) { my $fp = \*CMD; - if (defined($timeout)) { - doprint "timeout = $timeout\n"; - } my $line = wait_for_input($fp, $timeout); if (!defined($line)) { my $now = time; - if (defined($timeout) && (($now - $start_time) >= $timeout)) { + if ($timeout >= 0 && (($now - $start_time) >= $timeout)) { doprint "Hit timeout of $timeout, killing process\n"; $hit_timeout = 1; kill 9, $pid; @@ -2062,6 +2069,11 @@ sub wait_for_input { $time = $timeout; } + if ($time < 0) { + # Negative number means wait indefinitely + undef $time; + } + $rin = ''; vec($rin, fileno($fp), 1) = 1; vec($rin, fileno(\*STDIN), 1) = 1; diff --git a/tools/testing/ktest/sample.conf b/tools/testing/ktest/sample.conf index 2d0fe15a096d..f43477a9b857 100644 --- a/tools/testing/ktest/sample.conf +++ b/tools/testing/ktest/sample.conf @@ -817,6 +817,11 @@ # is issued instead of a reboot. # CONNECT_TIMEOUT = 25 +# The timeout in seconds for how long to wait for any running command +# to timeout. If not defined, it will let it go indefinitely. +# (default undefined) +#RUN_TIMEOUT = 600 + # In between tests, a reboot of the box may occur, and this # is the time to wait for the console after it stops producing # output. Some machines may not produce a large lag on reboot