This test is for PCI devices in a surprise remove capable slot and tests
how well the drivers and kernel handle losing the link to that device.

The test finds the PCI Express Capability of the PCI slot a block device
is in, then at offset 0x10 (the Link Control Register) writes a 1 to
bit 4 (Link Disable). This occurs unbeknownst to any of the drivers, just
like a surprise removal. Drivers will find out about this through the
PCIe hotplug handler, at which point it's too late to communicate with
the device, therefore testing how well we cope with the condition. The
link is reenabled at the end of the test.

Note, this is currently incompatible with NVMe Subsystems when
CONFIG_NVME_MULTIPATH is enabled, since the /dev/nvme*n* names don't have
a PCI parent in sysfs.

Signed-off-by: Keith Busch <keith.busch@xxxxxxxxx>
---
v1 -> v2:

  Incorporated feedback from Omar and Johannes

  Included the 016.out file

  Updated 'fio' parameters so we ignore the errors that will inevitably
  occur with this test so they don't muck with the output.
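For anyone unfamiliar with the `value:mask` form that the test passes to setpci, here is a minimal sketch of the arithmetic involved. It touches no hardware: `masked_write` is a hypothetical helper written for illustration, and the starting Link Control value of 0040 is an assumed example, not something read from a real device.

```shell
#!/bin/bash
# Pure-arithmetic illustration of a masked 16-bit register write.
# masked_write OLD VALUE MASK prints the resulting register value in hex.
# All arguments are hex strings without a 0x prefix, as setpci prints them.
# (Hypothetical helper; not part of blktests or pciutils.)
masked_write() {
	local old=$((16#$1)) value=$((16#$2)) mask=$((16#$3))
	# Keep every bit outside the mask, take the masked bits from VALUE.
	printf '%04x\n' $(( (old & ~mask & 0xffff) | (value & mask) ))
}

# Assumed Link Control contents; real values vary by platform.
lnkctl=0040
disabled="$(masked_write "$lnkctl" 10 10)"    # set bit 4 (Link Disable)
restored="$(masked_write "$disabled" 00 10)"  # clear bit 4 again
echo "$lnkctl -> $disabled -> $restored"
```

Run as-is, this prints `0040 -> 0050 -> 0040`: only bit 4 changes in either direction, which is why the test can restore the link without knowing the other Link Control bits.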

Just an example of what to expect in a dmesg on a capable platform:

[ 77.850030] run blktests block/016 at 2018-04-26 15:40:19
[ 82.890360] pciehp 0000:5d:00.0:pcie004: Slot(0-3): Link Down
[ 82.911102] nvme2n1: detected capacity change from 400088457216 to 0
[ 82.911117] print_req_error: 67 callbacks suppressed
[ 82.911120] print_req_error: I/O error, dev nvme2n1, sector 12481728
[ 82.911156] print_req_error: I/O error, dev nvme2n1, sector 315938512
[ 82.911163] print_req_error: I/O error, dev nvme2n1, sector 86586144
[ 82.911171] print_req_error: I/O error, dev nvme2n1, sector 297252456
[ 82.911175] print_req_error: I/O error, dev nvme2n1, sector 266779224
[ 82.911179] print_req_error: I/O error, dev nvme2n1, sector 57643584
[ 82.911182] print_req_error: I/O error, dev nvme2n1, sector 561615936
[ 82.911187] print_req_error: I/O error, dev nvme2n1, sector 199511192
[ 82.911191] print_req_error: I/O error, dev nvme2n1, sector 613858480
[ 82.911194] print_req_error: I/O error, dev nvme2n1, sector 2027136
[ 87.918781] pciehp 0000:5d:00.0:pcie004: Slot(0-3): Card not present
[ 87.937184] pciehp 0000:5d:00.0:pcie004: Slot(0-3): Card present
[ 87.952707] pciehp 0000:5d:00.0:pcie004: Slot(0-3): Link Up
[ 88.064187] pci 0000:5e:00.0: [8086:0953] type 00 class 0x010802
[ 88.064216] pci 0000:5e:00.0: reg 0x10: [mem 0x00000000-0x00003fff 64bit]
[ 88.064242] pci 0000:5e:00.0: reg 0x30: [mem 0x00000000-0x0000ffff pref]
[ 88.064251] pci 0000:5e:00.0: enabling Extended Tags
[ 88.064491] pci 0000:5e:00.0: BAR 6: assigned [mem 0xb8800000-0xb880ffff pref]
[ 88.064495] pci 0000:5e:00.0: BAR 0: assigned [mem 0xb8810000-0xb8813fff 64bit]
[ 88.064506] pcieport 0000:5d:00.0: PCI bridge to [bus 5e]
[ 88.064510] pcieport 0000:5d:00.0: bridge window [io 0x8000-0x8fff]
[ 88.064515] pcieport 0000:5d:00.0: bridge window [mem 0xb8800000-0xb89fffff]
[ 88.064519] pcieport 0000:5d:00.0: bridge window [mem 0x38c001000000-0x38c002ffffff 64bit pref]
[ 88.064987] nvme nvme2: pci function 0000:5e:00.0
[ 88.065060] nvme 0000:5e:00.0: enabling device (0100 -> 0102)

 common/rc           | 19 +++++++++++++++++++
 tests/block/016     | 54 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 tests/block/016.out |  2 ++
 3 files changed, 75 insertions(+)
 create mode 100755 tests/block/016
 create mode 100644 tests/block/016.out

diff --git a/common/rc b/common/rc
index 1bd0374..8115b66 100644
--- a/common/rc
+++ b/common/rc
@@ -171,6 +171,25 @@ _get_pci_dev_from_blkdev() {
 		tail -1
 }
 
+_get_pci_parent_from_blkdev() {
+	readlink -f "$TEST_DEV_SYSFS/device" | \
+		grep -Eo '[0-9a-f]{4,5}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]' | \
+		tail -2 | head -1
+}
+
+_test_dev_in_hotplug_slot() {
+	local parent
+	parent="$(_get_pci_parent_from_blkdev)"
+
+	local slt_cap
+	slt_cap="$(setpci -s "${parent}" CAP_EXP+14.w)"
+	if [ $((0x${slt_cap} & 0x20)) -eq 0 ]; then
+		SKIP_REASON="$TEST_DEV is not in a hot pluggable slot"
+		return 1
+	fi
+	return 0
+}
+
 # Older versions of xfs_io use pwrite64 and such, so the error messages won't
 # match current versions of xfs_io. See c52086226bc6 ("filter: xfs_io output
 # has dropped "64" from error messages") in xfstests.
diff --git a/tests/block/016 b/tests/block/016
new file mode 100755
index 0000000..0d54238
--- /dev/null
+++ b/tests/block/016
@@ -0,0 +1,54 @@
+#!/bin/bash
+#
+# Do disable PCI device while doing I/O to it
+#
+# Copyright (C) 2018 Keith Busch <keith.busch@xxxxxxxxx>
+#
+# This program is free software: you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation, either version 3 of the License, or
+# (at your option) any later version.
+#
+# This program is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+# GNU General Public License for more details.
+#
+# You should have received a copy of the GNU General Public License
+# along with this program. If not, see <http://www.gnu.org/licenses/>.
+
+DESCRIPTION="break PCI link device while doing I/O"
+TIMED=1
+
+requires() {
+	_have_fio && _have_program setpci
+}
+
+device_requires() {
+	_test_dev_is_pci && _test_dev_in_hotplug_slot
+}
+
+test_device() {
+	echo "Running ${TEST_NAME}"
+
+	local parent
+	local TIMEOUT
+
+	parent="$(_get_pci_parent_from_blkdev)"
+
+	# start fio job
+	TIMEOUT=10
+	_run_fio_rand_io --filename="$TEST_DEV" --time_based \
+		--continue_on_error=io 2> /dev/null &
+	sleep 5
+
+	# masks the slot's link disable bit to 'on'
+	setpci -s "${parent}" CAP_EXP+10.w=10:10
+	sleep 5
+
+	# masks the slot's link disable bit back to 'off'
+	setpci -s "${parent}" CAP_EXP+10.w=00:10
+	sleep 5
+
+	echo "Test complete"
+}
diff --git a/tests/block/016.out b/tests/block/016.out
new file mode 100644
index 0000000..e851e8e
--- /dev/null
+++ b/tests/block/016.out
@@ -0,0 +1,2 @@
+Running block/016
+Test complete
-- 
2.14.3
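
As a way to try the two helper steps from common/rc without a hot-plug capable machine, the following standalone sketch applies the same grep pipeline and the same 0x20 check to literal inputs. The function names `pci_parent_of` and `slot_has_surprise` and the sysfs path are hypothetical, chosen for illustration; the path is only modeled on the addresses in the dmesg excerpt, and real sysfs layouts may differ.

```shell
#!/bin/bash
# pci_parent_of PATH: print the second-to-last PCI address found in PATH,
# using the same pipeline as _get_pci_parent_from_blkdev in the patch.
# (Hypothetical standalone helper; takes the path as an argument instead
# of resolving $TEST_DEV_SYSFS/device.)
pci_parent_of() {
	echo "$1" | \
		grep -Eo '[0-9a-f]{4,5}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f]' | \
		tail -2 | head -1
}

# slot_has_surprise SLTCAP: succeed if bit 5 (0x20, Hot-Plug Surprise) is
# set in the hex word SLTCAP, mirroring the check in
# _test_dev_in_hotplug_slot.
slot_has_surprise() {
	[ $((0x$1 & 0x20)) -ne 0 ]
}

# Assumed resolved sysfs path; not taken from a real system.
path="/sys/devices/pci0000:5d/0000:5d:00.0/0000:5e:00.0/nvme/nvme2"
pci_parent_of "$path"
```

With this example path the script prints `0000:5d:00.0`, the port above the NVMe function. Note that when a path contains only one PCI address, `tail -2 | head -1` yields the device itself rather than a parent, so the sketch (like the pipeline it copies) assumes the device sits below at least one bridge.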