[PATCH blktests] block/008: check CPU offline failure due to many IRQs

When a system has more IRQs than a single CPU can handle, the test case
block/008 fails with a kernel message such as:

   "CPU 31 has 111 vectors, 90 available. Cannot disable CPU"

The cause of the failure is that the test case offlines too many CPUs
and the remaining online CPUs cannot hold all of the required IRQ
vectors. To avoid this failure, check the error message of the CPU
offline operation. If the CPU offline fails due to an IRQ vector
shortage, do not treat it as a test failure. Also record the number of
CPUs that were actually offlined without failure and use this number as
the maximum number of CPUs to offline for the rest of the test.
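
For illustration only, the idea can be sketched outside of blktests as
follows. This is a minimal sketch, not the test code itself:
fake_offline_cpu, FAKE_LIMIT and count_offlinable are hypothetical names
introduced here so that the example runs without root or real CPU
hotplug. The stand-in only imitates the ENOSPC error text that a real
offline attempt reports.

```shell
#!/bin/bash

# Hypothetical stand-in for the blktests helper _offline_cpu. It pretends
# that CPUs numbered FAKE_LIMIT or above cannot be offlined because the
# remaining CPUs would run out of IRQ vectors (ENOSPC).
fake_offline_cpu() {
	local cpu=$1
	if (( cpu >= FAKE_LIMIT )); then
		echo "write error: No space left on device" >&2
		return 1
	fi
	return 0
}

# Try to offline the given CPUs one by one. Stop at the first ENOSPC and
# print how many CPUs were offlined before that point.
count_offlinable() {
	local cpu err offlined=0
	for cpu in "$@"; do
		if err=$(fake_offline_cpu "$cpu" 2>&1); then
			offlined=$((offlined + 1))
		elif [[ $err =~ "No space left on device" ]]; then
			# IRQ vector shortage: a limit, not a test failure.
			break
		else
			echo "Failed to offline CPU: $err" >&2
			return 1
		fi
	done
	echo "$offlined"
}

FAKE_LIMIT=3
max_offline=$(count_offlinable 0 1 2 3 4)
echo "max_offline=$max_offline"
```

With FAKE_LIMIT=3 the sketch stops at the fourth CPU and reports three
CPUs as the maximum. Any error other than ENOSPC is still reported as a
failure.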

Signed-off-by: Shin'ichiro Kawasaki <shinichiro.kawasaki@xxxxxxx>
---
 tests/block/008 | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

diff --git a/tests/block/008 b/tests/block/008
index 7445f8f..75aae65 100755
--- a/tests/block/008
+++ b/tests/block/008
@@ -60,17 +60,30 @@ test_device() {
 
 		if (( offlining )); then
 			idx=$((RANDOM % ${#online_cpus[@]}))
-			_offline_cpu "${online_cpus[$idx]}"
-			offline_cpus+=("${online_cpus[$idx]}")
-			unset "online_cpus[$idx]"
-			online_cpus=("${online_cpus[@]}")
-		else
+			if err=$(_offline_cpu "${online_cpus[$idx]}" 2>&1); then
+				offline_cpus+=("${online_cpus[$idx]}")
+				unset "online_cpus[$idx]"
+				online_cpus=("${online_cpus[@]}")
+			elif [[ $err =~ "No space left on device" ]]; then
+				# ENOSPC means CPU offline failure due to IRQ
+				# vector shortage. Keep current number of
+				# offline CPUs as maximum CPUs to offline.
+				max_offline=${#offline_cpus[@]}
+				offlining=0
+			else
+				echo "Failed to offline CPU: $err"
+				break
+			fi
+		fi
+
+		if (( !offlining )); then
 			idx=$((RANDOM % ${#offline_cpus[@]}))
 			_online_cpu "${offline_cpus[$idx]}"
 			online_cpus+=("${offline_cpus[$idx]}")
 			unset "offline_cpus[$idx]"
 			offline_cpus=("${offline_cpus[@]}")
 		fi
+
 		end_time=$(date +%s)
 		if (( end_time - start_time > timeout + 15 )); then
 			echo "fio did not finish after $timeout seconds!"
-- 
2.34.1