On Nov 04, 2024 / 11:29, Chaitanya Kulkarni wrote: > Add a test that randomly sets the cpumask from available CPUs for > the nvmet-wq while running the fio workload. This patch has been > tested on nvme-loop and nvme-tcp transport. > > Signed-off-by: Chaitanya Kulkarni <kch@xxxxxxxxxx> Thanks. As I noted in another mail, this test case generates a kernel INFO message, and it looks like catching a new bug. So I think this test case worth adding. Please find my review comments in line. > --- > tests/nvme/055 | 99 ++++++++++++++++++++++++++++++++++++++++++++++ > tests/nvme/055.out | 3 ++ > 2 files changed, 102 insertions(+) > create mode 100755 tests/nvme/055 > create mode 100644 tests/nvme/055.out > > diff --git a/tests/nvme/055 b/tests/nvme/055 > new file mode 100755 > index 0000000..9fe27a3 > --- /dev/null > +++ b/tests/nvme/055 > @@ -0,0 +1,99 @@ > +#!/bin/bash > +# SPDX-License-Identifier: GPL-2.0+ > +# Copyright (C) 2024 Chaitanya Kulkarni > +# > +# Test nvmet-wq cpumask sysfs attribute with NVMe-oF and fio workload > +# > + > +. tests/nvme/rc > + > +DESCRIPTION="Test nvmet-wq cpumask sysfs attribute with fio on NVMe-oF device" > +TIMED=1 > + > +requires() { > + _nvme_requires > + _have_fio && _have_loop Nit, a unneccesaary space. > + _require_nvme_trtype_is_fabrics > +} I suggest to add set_conditions() here as below. Without it, the test case is skipped when users do not se NVMET_TRTYPES variable. If we add set_conditions(), _have_loop call in requires() can be removed. set_conditions() { _set_nvme_trtype "$@" } > + > +cleanup_setup() { > + _nvme_disconnect_subsys > + _nvmet_target_cleanup > +} > + > +test() { > + local cpumask_path="/sys/devices/virtual/workqueue/nvmet-wq/cpumask" > + local original_cpumask > + local min_cpus > + local max_cpus > + local numbers > + local idx > + local ns > + > + echo "Running ${TEST_NAME}" > + > + _setup_nvmet > + _nvmet_target_setup > + _nvme_connect_subsys > + > + if [ ! -f "$cpumask_path" ]; then > + SKIP_REASONS+=("nvmet_wq cpumask sysfs attribute not found.") > + cleanup_setup > + return 1 > + fi > + > + ns=$(_find_nvme_ns "${def_subsys_uuid}") > + > + original_cpumask=$(cat "$cpumask_path") > + > + num_cpus=$(nproc) > + max_cpus=$(( num_cpus < 20 ? num_cpus : 20 )) > + min_cpus=0 > + #shellcheck disable=SC2207 > + numbers=($(seq $min_cpus $max_cpus)) Nit: the shellcheck error can be vaoided with this: read -a numbers -d '' < <(seq $min_cpus $max_cpus) > + > + _run_fio_rand_io --filename="/dev/${ns}" --time_based --runtime=130s \ The --time_based and --runtime=130s options are not required, because fio helper bash functions will add them. Instead, let's add this line before the fio command. : ${TIMEOUT:=60} The fio helper functions will refect this TIMEOUT value to the --runtime option. This will allow users to control runtime cost as TIMED=1 indicates. > + --iodepth=8 --size="${NVME_IMG_SIZE}" &> "$FULL" & > + > + # Let the fio settle down else we will break in the loop for fio check > + sleep 1 > + for ((i = 0; i < max_cpus; i++)); do > + if ! pgrep -x fio &> /dev/null ; then pgrep command is not in the GNU coreutils. I suggest to keep the fio process id in a variable and check with "kill -0" command. The test case block/005 does it. > + break > + fi > + > + if [[ ${#numbers[@]} -eq 0 ]]; then > + break > + fi > + > + idx=$((RANDOM % ${#numbers[@]})) > + > + #shellcheck disable=SC2004 > + cpu_mask=$(printf "%X" $((1 << ${numbers[idx]}))) > + echo "$cpu_mask" > "$cpumask_path" When I ran this test case, I observed an error at this line: echo: write error: Value too large for defined data type I think the cpu_mask calculation is wrong, and it should be like this: cpu_mask=0 for ((n = 0; n < numbers[idx]; n++)); do cpu_mask=$((cpu_mask + (1 << n))) done cpu_mask=$(printf "%X" $((cpu_mask))) > + if [[ $(cat "$cpumask_path") =~ ^[0,]*${cpu_mask}\n$ ]]; then > + echo "Test Failed: cpumask was not set correctly " > + echo "Expected ${cpu_mask} found $(cat "$cpumask_path")" > + cleanup_setup > + return 1 > + fi > + sleep 3 > + # Remove the selected number > + numbers=("${numbers[@]:0:$idx}" "${numbers[@]:$((idx + 1))}") > + done > + > + killall fio &> /dev/null killall is not in GNU coreutils either (blktests already uses it at other places though...) Does "kill -9" work instead? > + > + # Restore original cpumask > + echo "$original_cpumask" > "$cpumask_path" > + restored_cpumask=$(cat "$cpumask_path") > + > + if [[ "$restored_cpumask" != "$original_cpumask" ]]; then > + echo "Failed to restore original cpumask." > + cleanup_setup > + return 1 > + fi > + > + cleanup_setup > + echo "Test complete" > +} > diff --git a/tests/nvme/055.out b/tests/nvme/055.out > new file mode 100644 > index 0000000..427dfee > --- /dev/null > +++ b/tests/nvme/055.out > @@ -0,0 +1,3 @@ > +Running nvme/055 > +disconnected 1 controller(s) > +Test complete > -- > 2.40.0 >