On 08/04/2020 14:10, Daniel Wagner wrote:
> Hi John,
>
> On Wed, Apr 08, 2020 at 02:01:11PM +0100, John Garry wrote:
>> Is stress-cpu-hotplug an LTP test? Or from Steven Rostedt - I saw some
>> threads where he mentioned some script?
>
> My bad. It's the script from Steven, which toggles the CPUs on/off in a
> binary counting pattern:
>
> [...]
> 2432 disabling cpu16 disabling cpu17 disabling cpu2
> 2433 disabling cpu1 enabling cpu16 enabling cpu17 enabling cpu2
> 2434 disabling cpu10 disabling cpu16 disabling cpu17 disabling cpu2
> 2435 enabling cpu1 enabling cpu10 enabling cpu16 enabling cpu17 enabling cpu2
> 2436 disabling cpu11 disabling cpu16 disabling cpu17 disabling cpu2
> 2437 disabling cpu1 enabling cpu11 enabling cpu16 enabling cpu17 enabling cpu2
> 2438 disabling cpu10 disabling cpu11 disabling cpu16 disabling cpu17 disabling cpu2
> [...]
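
For reference, a toggle loop in that spirit could look like the rough
sketch below - not Steven's actual script, and the CPU list and the
bit-to-CPU mapping are just placeholders:

#!/bin/bash
# Rough sketch (untested), not Steven's actual script: count upward and
# toggle a CPU whenever the matching bit of the counter flips, giving a
# binary on/off pattern similar to the log above.
CPUS=(1 2 10 11 16 17)          # placeholder list of hotpluggable CPUs
for ((i = 1; ; i++)); do
    msg="$i"
    for ((b = 0; b < ${#CPUS[@]}; b++)); do
        # only touch CPUs whose bit changed between i-1 and i
        if (( ((i ^ (i - 1)) >> b) & 1 )); then
            if (( (i >> b) & 1 )); then
                echo 0 > /sys/devices/system/cpu/cpu${CPUS[b]}/online
                msg+=" disabling cpu${CPUS[b]}"
            else
                echo 1 > /sys/devices/system/cpu/cpu${CPUS[b]}/online
                msg+=" enabling cpu${CPUS[b]}"
            fi
        fi
    done
    echo "$msg"
done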
OK, but to really test this you need to ensure that all the CPUs in a
managed interrupt's affinity mask are offlined together for a period of
time greater than the IO timeout. Otherwise the hw queue's managed
interrupt would not be shut down, and you're not verifying that the
queues are fully drained.
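
Something like the following (a rough, untested sketch; the IRQ number
is a placeholder, and /proc/irq/<irq>/smp_affinity_list may hold ranges
like "0-3") would exercise that case:

#!/bin/bash
# Offline every CPU in one managed interrupt's affinity mask together,
# hold them offline for longer than the IO timeout, then online them.
IRQ=123                         # placeholder: one of the device's managed IRQs

expand_list() {                 # expand "0-3,7" into "0 1 2 3 7"
    local f out=""
    IFS=, read -ra fields <<< "$1"
    for f in "${fields[@]}"; do
        if [[ $f == *-* ]]; then
            out+=" $(seq -s' ' "${f%-*}" "${f#*-}")"
        else
            out+=" $f"
        fi
    done
    echo $out
}

CPUS=$(expand_list "$(cat /proc/irq/$IRQ/smp_affinity_list)")
for c in $CPUS; do echo 0 > /sys/devices/system/cpu/cpu$c/online; done
sleep 60                        # must exceed the IO timeout (30s is the usual SCSI default)
for c in $CPUS; do echo 1 > /sys/devices/system/cpu/cpu$c/online; done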
>> Will the fio processes migrate back onto cpus which have been onlined
>> again?
>
> Hmm, good question. I've tried to assign them to a specific CPU via
> --cpus_allowed_policy=split and --cpus_allowed.
>
> fio --rw=randwrite --name=test --size=50M --iodepth=32 --direct=1 \
>     --bs=4k --numjobs=40 --time_based --runtime=1h --ioengine=libaio \
>     --group_reporting --cpus_allowed_policy=split --cpus_allowed=0-40
>
> Though I haven't verified what happens when the CPU gets back online.
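
One way to check would be to watch which CPU each fio thread is running
on while toggling (PSR is the processor a task last ran on) - a rough
suggestion, untested here:

# Show which CPU every fio thread last ran on, refreshed each second.
watch -n1 "ps -L -o tid,psr,comm -C fio"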
Maybe this will work since you're offlining patterns of CPUs and the fio
processes have to migrate somewhere. But see above.
>> What is the block driver, NVMe?
>
> I've used a qla2xxx device. Hannes asked me to retry it with a megasas
> device.
Thanks,
John