Re: blktests nvme/039 failure

On 4/10/23 4:49 AM, Shin'ichiro Kawasaki wrote:
Hello Alan,

I noticed that nvme/039 has recently been failing occasionally on my system
(around 40% of runs).
The failure messages are as follows:

nvme/039 => nvme0n1 (test error logging)                     [failed]
     runtime  0.176s  ...  0.167s
     --- tests/nvme/039.out      2023-04-06 10:11:07.925670528 +0900
     +++ /home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad   2023-04-10 20:15:07.679538017 +0900
     @@ -1,5 +1,2 @@
      Running nvme/039
     - Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
     - Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
     - Write(0x1) @ LBA 0, 1 blocks, Write Fault (sct 0x2 / sc 0x80) DNR
      Test complete

nvme/039 => nvme0n1 (test error logging)                     [failed]
     runtime  0.167s  ...  0.199s
     --- tests/nvme/039.out      2023-04-06 10:11:07.925670528 +0900
     +++ /home/shin/Blktests/blktests/results/nvme0n1/nvme/039.out.bad   2023-04-10 20:15:09.114539650 +0900
     @@ -1,5 +1,4 @@
      Running nvme/039
     - Read(0x2) @ LBA 0, 1 blocks, Unrecovered Read Error (sct 0x2 / sc 0x81) DNR
       Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR
       Write(0x1) @ LBA 0, 1 blocks, Write Fault (sct 0x2 / sc 0x80) DNR
      Test complete

It looks like the expected error messages were not reported.

I suspect that the interval between enabling error injection and issuing the
I/O that triggers the error is too short. With the one-line change below, which
adds a wait after enabling error injection, the failures disappear. Do you
think such a wait is a valid fix?

  tests/nvme/rc | 1 +
  1 file changed, 1 insertion(+)

diff --git a/tests/nvme/rc b/tests/nvme/rc
index 210a82a..7043c23 100644
--- a/tests/nvme/rc
+++ b/tests/nvme/rc
@@ -652,6 +652,7 @@ _nvme_enable_err_inject()
          echo "$4" > /sys/kernel/debug/"$1"/fault_inject/dont_retry
          echo "$5" > /sys/kernel/debug/"$1"/fault_inject/status
          echo "$6" > /sys/kernel/debug/"$1"/fault_inject/times
+	sleep 0.1
  }
_nvme_disable_err_inject()

I've been able to reproduce it. The "sleep 0.1" helps but doesn't eliminate the issue. I did notice that whenever there was a failure, there was also a "blk_print_req_error: 2 callbacks suppressed" message in the log, which would break the parsing the test needs to do.
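
To tell this case apart from other failures, a check along the following lines might help. This is only a rough sketch: the helper names (has_suppressed_callbacks, drop_suppressed_notices) and the sample log text are hypothetical and are not part of blktests. It reads log text on stdin and looks for the "callbacks suppressed" notice. Note that dropping the notice only keeps it from shifting the lines a parser expects; it does not bring back messages the rate limiter already discarded.

```shell
#!/bin/bash

# Hypothetical helper (not part of blktests): succeeds if the log text on
# stdin contains a ratelimit notice such as
# "blk_print_req_error: 2 callbacks suppressed".
has_suppressed_callbacks() {
	grep -q -E '[0-9]+ callbacks suppressed'
}

# Hypothetical helper: removes ratelimit notices from the log text on stdin
# and passes every other line through unchanged.
drop_suppressed_notices() {
	sed -E '/[0-9]+ callbacks suppressed/d'
}

# Made-up sample log, modeled on the messages quoted in this thread.
sample_log='blk_print_req_error: 2 callbacks suppressed
Read(0x2) @ LBA 0, 1 blocks, Unknown (sct 0x3 / sc 0x75) DNR'

if printf '%s\n' "$sample_log" | has_suppressed_callbacks; then
	echo "ratelimit notice found; some error messages were likely dropped"
fi

printf '%s\n' "$sample_log" | drop_suppressed_notices
```

A test could use the first helper to report "messages were rate limited" instead of a plain output mismatch, which would make this kind of failure easier to recognize.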


Alan




