Re: [PATCH net-next] selftests: net: give up on the cmsg_time accuracy on slow machines

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Jakub Kicinski <kuba@xxxxxxxxxx> writes:

> Commit b9d5f5711dd8 ("selftests: net: increase the delay for relative
> cmsg_time.sh test") widened the accepted value range 8x but we still
> see flakes (at a rate of around 7%).
>
> Return XFAIL for the most timing sensitive test on slow machines.
>
> Before:
>
>   # ./cmsg_time.sh
>     Case UDPv4  - TXTIME rel returned '8074us - 7397us < 4000', expected 'OK'
>   FAIL - 1/36 cases failed
>
> After:
>
>   # ./cmsg_time.sh
>     Case UDPv4  - TXTIME rel returned '1123us - 941us < 500', expected 'OK' (XFAIL)
>     Case UDPv6  - TXTIME rel returned '1227us - 776us < 500', expected 'OK' (XFAIL)
>   OK
>
> Signed-off-by: Jakub Kicinski <kuba@xxxxxxxxxx>
> ---
> CC: shuah@xxxxxxxxxx
> CC: linux-kselftest@xxxxxxxxxxxxxxx
> CC: willemdebruijn.kernel@xxxxxxxxx
> ---
>  tools/testing/selftests/net/cmsg_time.sh | 35 +++++++++++++++++-------
>  1 file changed, 25 insertions(+), 10 deletions(-)
>
> diff --git a/tools/testing/selftests/net/cmsg_time.sh b/tools/testing/selftests/net/cmsg_time.sh
> index 1d7e756644bc..478af0aefa97 100755
> --- a/tools/testing/selftests/net/cmsg_time.sh
> +++ b/tools/testing/selftests/net/cmsg_time.sh
> @@ -34,13 +34,28 @@ BAD=0
>  TOTAL=0
>  
>  check_result() {
> +    local ret=$1
> +    local got=$2
> +    local exp=$3
> +    local case=$4
> +    local xfail=$5
> +    local xf=
> +    local inc=
> +
> +    if [ "$xfail" == "xfail" ]; then
> +	xf="(XFAIL)"
> +	inc=0
> +    else
> +	inc=1
> +    fi
> +
>      ((TOTAL++))
> -    if [ $1 -ne 0 ]; then
> -	echo "  Case $4 returned $1, expected 0"
> -	((BAD++))
> +    if [ $ret -ne 0 ]; then
> +	echo "  Case $case returned $ret, expected 0 $xf"
> +	((BAD+=inc))
>      elif [ "$2" != "$3" ]; then
> -	echo "  Case $4 returned '$2', expected '$3'"
> -	((BAD++))
> +	echo "  Case $case returned '$got', expected '$exp' $xf"
> +	((BAD+=inc))
>      fi
>  }
>  
> @@ -66,14 +81,14 @@ for i in "-4 $TGT4" "-6 $TGT6"; do
>  		 awk '/SND/ { if ($3 > 1000) print "OK"; }')
>  	check_result $? "$ts" "OK" "$prot - TXTIME abs"
>  
> -	[ "$KSFT_MACHINE_SLOW" = yes ] && delay=8000 || delay=1000
> +	[ "$KSFT_MACHINE_SLOW" = yes ] && xfail=xfail
>  
> -	ts=$(ip netns exec $NS ./cmsg_sender -p $p $i 1234 -t -d $delay |
> +	ts=$(ip netns exec $NS ./cmsg_sender -p $p $i 1234 -t -d 1000 |
>  		 awk '/SND/ {snd=$3}
>  		      /SCHED/ {sch=$3}
> -		      END { if (snd - sch > '$((delay/2))') print "OK";
> -			    else print snd, "-", sch, "<", '$((delay/2))'; }')
> -	check_result $? "$ts" "OK" "$prot - TXTIME rel"
> +		      END { if (snd - sch > 500) print "OK";
> +			    else print snd, "-", sch, "<", 500; }')
> +	check_result $? "$ts" "OK" "$prot - TXTIME rel" $xfail
>      done
>  done

This logging and xfail handling duplicates lib.sh. Would a patch like
below be OK with you? The gist of it is to just use check_err, log_test
and xfail_on_slow from lib.sh to achieve what the test open-codes.

I can send it as a follow-up instead if you prefer.

--- a/tools/testing/selftests/net/cmsg_time.sh
+++ b/tools/testing/selftests/net/cmsg_time.sh
@@ -29,19 +29,21 @@ ip -netns $NS addr add $IP6 dev dummy0
 # Need FQ for TXTIME
 ip netns exec $NS tc qdisc replace dev dummy0 root fq
 
-# Test
-BAD=0
-TOTAL=0
-
 check_result() {
-    ((TOTAL++))
-    if [ $1 -ne 0 ]; then
-	echo "  Case $4 returned $1, expected 0"
-	((BAD++))
-    elif [ "$2" != "$3" ]; then
-	echo "  Case $4 returned '$2', expected '$3'"
-	((BAD++))
-    fi
+    local ret=$1
+    local got=$2
+    local exp=$3
+    local case=$4
+
+    RET=0
+
+    [ $1 -eq 0 ]
+    check_err $? "Case $4 returned $1, expected 0"
+
+    [ "$2" == "$3" ]
+    check_err $? "Case $4 returned '$2', expected '$3'"
+
+    log_test "$4"
 }
 
 for i in "-4 $TGT4" "-6 $TGT6"; do
@@ -73,15 +79,8 @@ for i in "-4 $TGT4" "-6 $TGT6"; do
 		      /SCHED/ {sch=$3}
 		      END { if (snd - sch > '$((delay/2))') print "OK";
 			    else print snd, "-", sch, "<", '$((delay/2))'; }')
-	check_result $? "$ts" "OK" "$prot - TXTIME rel"
+	xfail_on_slow check_result $? "$ts" "OK" "$prot - TXTIME rel"
     done
 done
 
-# Summary
-if [ $BAD -ne 0 ]; then
-    echo "FAIL - $BAD/$TOTAL cases failed"
-    exit 1
-else
-    echo "OK"
-    exit 0
-fi
+exit $EXIT_STATUS

This is much more verbose, but that's how tests tend to be. I could
change it to only log on RET != 0, but like this the custom results
block can go away and the test is overall more median.

bash-5.2# KSFT_MACHINE_SLOW=yes ./cmsg_time.sh   
TEST: UDPv4  - no options                                           [ OK ]
TEST: UDPv4  - ts cnt                                               [ OK ]
TEST: UDPv4  - ts0 SCHED                                            [ OK ]
TEST: UDPv4  - ts0 SND                                              [ OK ]
TEST: UDPv4  - TXTIME abs                                           [ OK ]
TEST: UDPv4  - TXTIME rel                                           [XFAIL]
        Case UDPv4  - TXTIME rel returned '34us - 30us < 4000', expected 'OK'
TEST: ICMPv4  - no options                                          [ OK ]
TEST: ICMPv4  - ts cnt                                              [ OK ]
[... snip ...]
bash-5.2# echo $?
0




[Index of Archives]     [Linux Wireless]     [Linux Kernel]     [ATH6KL]     [Linux Bluetooth]     [Linux Netdev]     [Kernel Newbies]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux ATA RAID]     [Samba]     [Device Mapper]

  Powered by Linux