On 11/7/24 21:21, Shinichiro Kawasaki wrote: > On Nov 04, 2024 / 11:29, Chaitanya Kulkarni wrote: >> Add a test that randomly sets the cpumask from available CPUs for >> the nvmet-wq while running the fio workload. This patch has been >> tested on nvme-loop and nvme-tcp transport. >> >> Signed-off-by: Chaitanya Kulkarni <kch@xxxxxxxxxx> > > Thanks. As I noted in another mail, this test case generates a kernel > INFO message, and it looks like catching a new bug. So I think this > test case worth adding. Please find my review comments in line. > >> --- >> tests/nvme/055 | 99 ++++++++++++++++++++++++++++++++++++++++++++++ >> tests/nvme/055.out | 3 ++ >> 2 files changed, 102 insertions(+) >> create mode 100755 tests/nvme/055 >> create mode 100644 tests/nvme/055.out >> >> diff --git a/tests/nvme/055 b/tests/nvme/055 >> new file mode 100755 >> index 0000000..9fe27a3 >> --- /dev/null >> +++ b/tests/nvme/055 >> @@ -0,0 +1,99 @@ >> +#!/bin/bash >> +# SPDX-License-Identifier: GPL-2.0+ >> +# Copyright (C) 2024 Chaitanya Kulkarni >> +# >> +# Test nvmet-wq cpumask sysfs attribute with NVMe-oF and fio workload >> +# >> + >> +. tests/nvme/rc >> + >> +DESCRIPTION="Test nvmet-wq cpumask sysfs attribute with fio on NVMe-oF device" >> +TIMED=1 >> + >> +requires() { >> + _nvme_requires >> + _have_fio && _have_loop > > Nit, a unneccesaary space. > >> + _require_nvme_trtype_is_fabrics >> +} > done. > I suggest to add set_conditions() here as below. Without it, the test case is > skipped when users do not se NVMET_TRTYPES variable. If we add set_conditions(), > _have_loop call in requires() can be removed. > > set_conditions() { > _set_nvme_trtype "$@" > } > done. >> + >> +cleanup_setup() { >> + _nvme_disconnect_subsys >> + _nvmet_target_cleanup >> +} >> + >> +test() { >> + local cpumask_path="/sys/devices/virtual/workqueue/nvmet-wq/cpumask" >> + local original_cpumask >> + local min_cpus >> + local max_cpus >> + local numbers >> + local idx >> + local ns >> + >> + echo "Running ${TEST_NAME}" >> + >> + _setup_nvmet >> + _nvmet_target_setup >> + _nvme_connect_subsys >> + >> + if [ ! -f "$cpumask_path" ]; then >> + SKIP_REASONS+=("nvmet_wq cpumask sysfs attribute not found.") >> + cleanup_setup >> + return 1 >> + fi >> + >> + ns=$(_find_nvme_ns "${def_subsys_uuid}") >> + >> + original_cpumask=$(cat "$cpumask_path") >> + >> + num_cpus=$(nproc) >> + max_cpus=$(( num_cpus < 20 ? num_cpus : 20 )) >> + min_cpus=0 >> + #shellcheck disable=SC2207 >> + numbers=($(seq $min_cpus $max_cpus)) > > Nit: the shellcheck error can be vaoided with this: > > read -a numbers -d '' < <(seq $min_cpus $max_cpus) > done. >> + >> + _run_fio_rand_io --filename="/dev/${ns}" --time_based --runtime=130s \ > > The --time_based and --runtime=130s options are not required, because fio helper > bash functions will add them. > > Instead, let's add this line before the fio command. > > : ${TIMEOUT:=60} > > The fio helper functions will refect this TIMEOUT value to the --runtime > option. This will allow users to control runtime cost as TIMED=1 indicates. > done. >> + --iodepth=8 --size="${NVME_IMG_SIZE}" &> "$FULL" & >> + >> + # Let the fio settle down else we will break in the loop for fio check >> + sleep 1 >> + for ((i = 0; i < max_cpus; i++)); do >> + if ! pgrep -x fio &> /dev/null ; then > > pgrep command is not in the GNU coreutils. I suggest to keep the fio process id > in a variable and check with "kill -0" command. The test case block/005 does it. > thanks for the pointer, done and it looks much cleaner. >> + break >> + fi >> + >> + if [[ ${#numbers[@]} -eq 0 ]]; then >> + break >> + fi >> + >> + idx=$((RANDOM % ${#numbers[@]})) >> + >> + #shellcheck disable=SC2004 >> + cpu_mask=$(printf "%X" $((1 << ${numbers[idx]}))) >> + echo "$cpu_mask" > "$cpumask_path" > > When I ran this test case, I observed an error at this line: > > echo: write error: Value too large for defined data type > > I think the cpu_mask calculation is wrong, and it should be like this: > > cpu_mask=0 > for ((n = 0; n < numbers[idx]; n++)); do > cpu_mask=$((cpu_mask + (1 << n))) > done > cpu_mask=$(printf "%X" $((cpu_mask))) > > didn't see that error in my execution done. This version tries to only use one CPU at a time, but with your suggestion it will now consider more than one CPUs which I think is useful too. I modified the calculation to keep it simple for now. >> + if [[ $(cat "$cpumask_path") =~ ^[0,]*${cpu_mask}\n$ ]]; then >> + echo "Test Failed: cpumask was not set correctly " >> + echo "Expected ${cpu_mask} found $(cat "$cpumask_path")" >> + cleanup_setup >> + return 1 >> + fi >> + sleep 3 >> + # Remove the selected number >> + numbers=("${numbers[@]:0:$idx}" "${numbers[@]:$((idx + 1))}") >> + done >> + >> + killall fio &> /dev/null > > killall is not in GNU coreutils either (blktests already uses it at other > places though...) Does "kill -9" work instead? > done. Thanks for the review comments, I'll send it V2 soon. -ck