[PATCH kdevops] fstests: provide kconfig guidance for SOAK_DURATION




The kdevops test runner has supported a custom SOAK_DURATION for
fstests, however we were not providing any guidance, so folks
likely leave it disabled. Throw a bone and provide some basic guidance,
and use 2.5 hours as the default value. There are about 46 tests today
which use soak duration, so if you are testing serially this
increases total test time by about 5 days over the previously known
total test time.

Note that if you are using kernel-ci with a max loop goal of 100,
that means about 500 extra days, or roughly 1.3 years of extra total
test time. If you enable soak duration you may want to re-evaluate
your loop target goal for kernel-ci for kdevops.

Signed-off-by: Luis Chamberlain <mcgrof@xxxxxxxxxx>
---

Chandan, Amir, lemme know what you think of a default of 2.5 hours
if soak duration is enabled. The only thing is the math indicates that
if you are also going to enable kernel-ci we won't finish this year.

To be clear, we've picked up testing with soak duration seriously for
our LBS testing. It is why we've been able to find pretty hard to
reproduce issues even on the page cache for the baseline [0], i.e. without
LBS. While folks seem to have found value in adopting 2.5 hours,
given the results we have found, it obviously raises a scaling issue
to consider when deciding when we're done with testing our baseline.

At first I wrote this patch just to provide basic guidance for kdevops,
but after doing a bit of the math on how it also extends total test
time, *with* our kernel-ci effort, it clearly shows we should probably
consider lowering the kernel-ci threshold a bit if adopting soak
duration.
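
For reference, the math above can be sketched in a few lines of Python.
This is only a back-of-the-envelope estimate: it assumes all 46
soak-aware tests run serially and each one runs for the full soak
duration. The helper name extra_days() is made up for illustration and
is not part of kdevops or fstests.

```python
# Rough estimate of the extra serial test time added by SOAK_DURATION.
# Assumption: every soak-aware test runs for the full soak duration.
# extra_days() is a hypothetical helper, not part of kdevops or fstests.

SOAK_TESTS = 46  # tests known to use soak duration as of 2023-10-31


def extra_days(soak_hours, loops=1, tests=SOAK_TESTS):
    """Return the extra serial test time, in days."""
    return soak_hours * tests * loops / 24


single_run = extra_days(2.5)              # one full run with 2.5 hours
ci_goal = extra_days(2.5, loops=100)      # kernel-ci loop goal of 100

print(f"single run: {single_run:.1f} days")
print(f"100 loops:  {ci_goal:.0f} days ({ci_goal / 365:.1f} years)")
```

With 2.5 hours this gives roughly 4.8 days for a single run and about
479 days (1.3 years) for 100 loops, in line with the rounded figures in
the commit log.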

CC'ing a bit wider audience to get a better idea of what folks
might consider a sensible value for their own testing too. From what
we've been observing, SOAK_DURATION allows us to catch bugs faster than
just increasing the kernel-ci count, however, using both lets us catch
even more bugs.

To help *reduce* the amount of time to test we've deployed many kdevops
XFS clusters to help test the baseline. This is why our loop count on
kernel-ci is now about 50-60 with a soak duration of about 2.5 hours.

Also please note that the bugs reported so far are the ones with crashes.
There are other failures too, but we just haven't had the time to dissect
and report the non-fatal failures, as crashes have been
our priority.

[0] https://github.com/linux-kdevops/kdevops/blob/master/docs/xfs-bugs.md

 playbooks/roles/fstests/defaults/main.yml |  3 +
 workflows/fstests/Kconfig                 | 89 ++++++++++++++++++++---
 workflows/fstests/Makefile.sparsefiles    |  4 +
 3 files changed, 87 insertions(+), 9 deletions(-)

diff --git a/playbooks/roles/fstests/defaults/main.yml b/playbooks/roles/fstests/defaults/main.yml
index 2f70f9549cde..4a1f5dec5827 100644
--- a/playbooks/roles/fstests/defaults/main.yml
+++ b/playbooks/roles/fstests/defaults/main.yml
@@ -30,6 +30,9 @@ fstests_test_logdev_mkfs_opts: "/dev/null"
 fstests_test_dev_zns: "/dev/null"
 fstests_zns_enabled: False
 
+fstests_soak_duration_enable: False
+fstests_soak_duration: 0
+
 fstests_uses_no_devices: False
 fstests_generate_simple_config_enable: False
 fstests_generate_nvme_live_config_enable: False
diff --git a/workflows/fstests/Kconfig b/workflows/fstests/Kconfig
index 985a7847b6c7..bbd8927b3cd3 100644
--- a/workflows/fstests/Kconfig
+++ b/workflows/fstests/Kconfig
@@ -760,15 +760,23 @@ config FSTESTS_RUN_LARGE_DISK_TESTS
 	  to run. The "large disk" requirement is test dependent, but
 	  typically, it means a disk with capacity of at several 10G.
 
-config FSTESTS_SOAK_DURATION
-	int "Custom Soak duration to be used"
-	default 0
+config FSTESTS_ENABLE_SOAK_DURATION
+	bool "Enable custom soak duration time"
 	help
-	  Custom Soak duration to be used during test execution. If you set this
-	  to a non-zero value then fstests will increase the amount of time it
-	  takes to run certain tests which are time based and support using
-	  SOAK_DURATION. A moderate high value setting for this is 9900 which is
-	  2.5 hours.
+	  Enable soak duration to be used during test execution. If you are not
+	  interested in extending your testing then leave this disabled.
+
+	  If you set the soak duration to a non-zero value then fstests will
+	  increase the amount of time it takes to run certain tests which are
+	  time based and support using SOAK_DURATION. A moderately high value
+	  for this is 9000 which is 2.5 hours.
+
+	  Note that we have 46 tests today which will be able to use soak
+	  duration if set. This means your test time will increase by the
+	  soak duration multiplied by this number of tests. When soak
+	  duration is enabled the test specific watchdog fstests_watchdog.py
+	  will be aware of tests which require soak duration and take that
+	  into account before reporting a possible hang.
 
 	  As of 2023-10-31 that consists of the following tests which use either
 	  fsstress or fsx or fio. Tests either use SOAK_DURATION directly or they
@@ -786,7 +794,7 @@ config FSTESTS_SOAK_DURATION
 	  - generic/648 - fsstress + disk failures on loopback
 	  - generic/650 - fsstress - multithreaded write + CPU hotplug
 
-	  The tests below use _scratch_xfs_stress_scrub() to stress
+	  All the tests below use _scratch_xfs_stress_scrub() to stress
 	  test an with fsstress with scrub or an alternate xfs_db operation.
 
 	  - xfs/285
@@ -825,4 +833,67 @@ config FSTESTS_SOAK_DURATION
 	  - xfs/729
 	  - xfs/800
 
+if FSTESTS_ENABLE_SOAK_DURATION
+
+choice
+	prompt "Soak duration value to use"
+	default FSTESTS_SOAK_DURATION_HIGH
+
+config FSTESTS_SOAK_DURATION_CUSTOM
+	bool "Custom"
+	help
+	  You want to specify the value yourself.
+
+config FSTESTS_SOAK_DURATION_PATHALOGICAL
+	bool "Pathological (48 hours)"
+	help
+	  Use 48 hours for soak duration.
+
+	  Using this with 46 tests known to use soak duration means your test
+	  time will increase by about 92 days, or a bit over 3 months if run
+	  serially.
+
+config FSTESTS_SOAK_DURATION_HIGH
+	bool "High (2.5 hours)"
+	help
+	  Use 2.5 hours for soak duration.
+
+	  Using this with 46 tests known to use soak duration means your test
+	  time will increase by about 5 days if run serially.
+
+config FSTESTS_SOAK_DURATION_MID
+	bool "Mid (1 hour)"
+	help
+	  Use 1 hour for soak duration.
+
+	  Using this with 46 tests known to use soak duration means your test
+	  time will increase by about 2 days if run serially.
+
+config FSTESTS_SOAK_DURATION_LOW
+	bool "Low (30 minutes)"
+	help
+	  Use 30 minutes for soak duration.
+
+	  Using this with 46 tests known to use soak duration means your test
+	  time will increase by about 1 day if run serially.
+
+endchoice
+
+config FSTESTS_SOAK_DURATION_CUSTOM_VAL
+	int "Custom soak duration value (seconds)"
+	default 0
+	depends on FSTESTS_SOAK_DURATION_CUSTOM
+	help
+	  Enter your custom soak duration value in seconds.
+
+endif # FSTESTS_ENABLE_SOAK_DURATION
+
+config FSTESTS_SOAK_DURATION
+	default 0 if !FSTESTS_ENABLE_SOAK_DURATION
+	default FSTESTS_SOAK_DURATION_CUSTOM_VAL if FSTESTS_SOAK_DURATION_CUSTOM
+	default 1800 if FSTESTS_SOAK_DURATION_LOW
+	default 3600 if FSTESTS_SOAK_DURATION_MID
+	default 9000 if FSTESTS_SOAK_DURATION_HIGH
+	default 172800 if FSTESTS_SOAK_DURATION_PATHALOGICAL
+
 endif # KDEVOPS_WORKFLOW_ENABLE_FSTESTS
diff --git a/workflows/fstests/Makefile.sparsefiles b/workflows/fstests/Makefile.sparsefiles
index c5ca20a9c462..7dd129c4f9cc 100644
--- a/workflows/fstests/Makefile.sparsefiles
+++ b/workflows/fstests/Makefile.sparsefiles
@@ -44,6 +44,10 @@ FSTESTS_ARGS += run_large_disk_tests='$(FSTESTS_RUN_LARGE_DISK_TESTS)'
 FSTESTS_ARGS += run_auto_group_tests='$(FSTESTS_RUN_AUTO_GROUP_TESTS)'
 FSTESTS_ARGS += run_custom_group_tests='$(FSTESTS_RUN_CUSTOM_GROUP_TESTS)'
 FSTESTS_ARGS += exclude_test_groups='$(CONFIG_FSTESTS_EXCLUDE_TEST_GROUPS)'
+
+ifeq (y,$(CONFIG_FSTESTS_ENABLE_SOAK_DURATION))
+FSTESTS_ARGS += fstests_soak_duration_enable='True'
+endif
 FSTESTS_ARGS += fstests_soak_duration='$(CONFIG_FSTESTS_SOAK_DURATION)'
 
 ifeq (y,$(CONFIG_FSTESTS_ENABLE_RUN_CUSTOM_TESTS))
-- 
2.42.0




