[PATCH 3/3] check: add --print-start-done to enhance watchdogs

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



fstests specific watchdogs want to know when the full test suite will
start and end. Right now the kernel ring buffer can get augmented but we
can't know for sure if it was due to a test or some odd hardware issue
after fstests ran. This is specially true for systems left running tests in
loops in automation where we are not running things ourselves but rather just
get access to kernel logs, or for filesystem runner watdogs such as the one
in kdevops [0]. It is also often not easy to determine for sure based on
just logs when fstests check really has completed unless we have a
matching log of who spawned that test runner. Although we could keep track of
this ourselves by an annotation locally on the test runner, it is useful to
have independent tools which are not attached to the process which spawned
check to just peak into a system and verify the system's progress with
fstests by just using the kernel log. Keeping this in the test target kernel
ring buffer enables these use cases.

This is useful for example for filesyste checker specific watchdogs like the
one in kdevops so that the watchdog knows when to start hunting for crashes
based just on the kernel ring buffer, and so it also knows when the show is
over.

[0] https://github.com/linux-kdevops/kdevops/blob/master/scripts/workflows/fstests/fstests_watchdog.py

Demo:

root@demo-xfs-reflink /var/lib/xfstests # dmesg -c > /dev/null
root@demo-xfs-reflink /var/lib/xfstests # ./check -s xfs_reflink --print-start-done generic/002
SECTION       -- xfs_reflink
RECREATING    -- xfs on /dev/loop16
FSTYP         -- xfs (non-debug)
PLATFORM      -- Linux/x86_64 demo-xfs-reflink 6.5.0-5-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.5.13-1 (2023-11-29)
MKFS_OPTIONS  -- -f -f -m reflink=1,rmapbt=1, -i sparse=1, /dev/loop5
MOUNT_OPTIONS -- /dev/loop5 /media/scratch

generic/002        1s
Ran: generic/002
Passed all 1 tests

SECTION       -- xfs_reflink
=========================
Ran: generic/002
Passed all 1 tests

root@demo-xfs-reflink /var/lib/xfstests # dmesg -c
[61392.525562] XFS (loop16): Mounting V5 Filesystem 20c3bdf7-69d8-42b2-8a7c-fccff0949dcc
[61392.536059] XFS (loop16): Ending clean mount
[61392.684417] run fstests fstestsstart/000 at 2024-02-16 17:51:28
[61392.726709] XFS (loop16): Unmounting Filesystem 20c3bdf7-69d8-42b2-8a7c-fccff0949dcc
[61392.779791] XFS (loop16): Mounting V5 Filesystem ce4188e8-8da8-474a-acb4-b1d8c3e7edb7
[61392.791217] XFS (loop16): Ending clean mount
[61393.328386] XFS (loop5): Mounting V5 Filesystem 6e3f6c64-5b48-41b6-8810-d2a41fbcd125
[61393.340019] XFS (loop5): Ending clean mount
[61393.347636] XFS (loop5): Unmounting Filesystem 6e3f6c64-5b48-41b6-8810-d2a41fbcd125
[61393.403519] XFS (loop16): Unmounting Filesystem ce4188e8-8da8-474a-acb4-b1d8c3e7edb7
[61393.456945] XFS (loop16): Mounting V5 Filesystem ce4188e8-8da8-474a-acb4-b1d8c3e7edb7
[61393.466506] XFS (loop16): Ending clean mount
[61393.504926] run fstests generic/002 at 2024-02-16 17:51:29
[61394.579638] XFS (loop16): Unmounting Filesystem ce4188e8-8da8-474a-acb4-b1d8c3e7edb7
[61394.637296] XFS (loop16): Mounting V5 Filesystem ce4188e8-8da8-474a-acb4-b1d8c3e7edb7
[61394.646365] XFS (loop16): Ending clean mount
[61394.762667] XFS (loop16): Unmounting Filesystem ce4188e8-8da8-474a-acb4-b1d8c3e7edb7
[61394.790581] run fstests fstestsdone/000 at 2024-02-16 17:51:30

Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
---
 check | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/check b/check
index 523cf024c139..b3fdda57f665 100755
--- a/check
+++ b/check
@@ -19,6 +19,7 @@ have_test_arg=false
 randomize=false
 exact_order=false
 start_after_test=""
+print_start_done=false
 list_group_tests=false
 export here=`pwd`
 xfile=""
@@ -84,6 +85,7 @@ check options
     --large-fs		optimise scratch device for large filesystems
     --list-group-tests	only list tests part of the groups you specified, do not run the tests
     --start-after	only start testing after the test specified
+    --print-start-done  append to /dev/kmsg when available test start and end time
     -s section		run only specified section from config file
     -S section		exclude the specified section from the config file
     -L <n>		loop tests <n> times following a failure, measuring aggregate pass/fail metrics
@@ -379,6 +381,11 @@ while [ $# -gt 0 ]; do
 	--list-group-tests)
 		list_group_tests=true
 		;;
+	--print-start-done)
+		if [ -w /dev/kmsg ]; then
+			print_start_done=true
+		fi
+		;;
 	-s)	RUN_SECTION="$RUN_SECTION $2"; shift ;;
 	-S)	EXCLUDE_SECTION="$EXCLUDE_SECTION $2"; shift ;;
 	-l)	diff="diff" ;;
@@ -1161,6 +1168,11 @@ function run_section()
 	_scratch_unmount 2> /dev/null
 }
 
+if $print_start_done; then
+	start_time=`date +"%F %T"`
+	echo "run fstests fstestsstart/000 at $start_time" > /dev/kmsg
+fi
+
 for ((iters = 0; iters < $iterations; iters++)) do
 	for section in $HOST_OPTIONS_SECTIONS; do
 		run_section $section
@@ -1172,6 +1184,11 @@ for ((iters = 0; iters < $iterations; iters++)) do
 	done
 done
 
+if $print_start_done; then
+	end_time=`date +"%F %T"`
+	echo "run fstests fstestsdone/000 at $end_time" > /dev/kmsg
+fi
+
 interrupt=false
 status=`expr $sum_bad != 0`
 exit
-- 
2.42.0





[Index of Archives]     [Linux Ext4 Filesystem]     [Union Filesystem]     [Filesystem Testing]     [Ceph Users]     [Ecryptfs]     [NTFS 3]     [AutoFS]     [Kernel Newbies]     [Share Photos]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux Cachefs]     [Reiser Filesystem]     [Linux RAID]     [NTFS 3]     [Samba]     [Device Mapper]     [CEPH Development]

  Powered by Linux