On Jun 24, 2024 / 04:46, hch@xxxxxxxxxxxxx wrote:
> On Mon, Jun 24, 2024 at 11:21:33AM +0000, Shinichiro Kawasaki wrote:
> > I took a look in the test script and dm-dust code again, and now I think the dd
> > command is expected to succeed. The added bad blocks have the default wr_fail_cnt
> > value 0, so a write error should not happen for the dd command. (Bryan, if this
> > understanding is wrong, please let me know.)
> >
> > So the error log that Christoph observes indicates that the dd command failed,
> > and this failure is unexpected. I cannot think of any cause of the failure.
>
> Yes, it does indeed fail, this is 002.full with your patch:
>
> dd: error writing '/dev/mapper/dust1': Invalid argument
> 1+0 records in
> 0+0 records out
> 0 bytes copied, 0.000373943 s, 0.0 kB/s

Thank you for sharing the logs. The dd failed at the very first block.

> >
> > Christoph, may I ask you to share the kernel messages during the test run?
> > Also, I would like to check the dd command output. The one-liner patch below
> > for blktests will create results/vdb/dm/002.full with the dd output.
>
> the relevant lines of dmesg output below:
>
> [ 57.773967] run blktests dm/002 at 2024-06-24 11:43:53
> [ 57.791251] I/O error, dev vdb, sector 774 op 0x0:(READ) flags 0x80700 phys_seg 250 prio class 0
> [ 57.791849] I/O error, dev vdb, sector 520 op 0x0:(READ) flags 0x84700 phys_seg 254 prio class 0
> [ 57.792420] I/O error, dev vdb, sector 520 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [ 57.792805] I/O error, dev vdb, sector 521 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [ 57.793190] I/O error, dev vdb, sector 522 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [ 57.793578] I/O error, dev vdb, sector 523 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [ 57.793955] I/O error, dev vdb, sector 524 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [ 57.794318] I/O error, dev vdb, sector 525 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [ 57.794700] I/O error, dev vdb, sector 526 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [ 57.795130] I/O error, dev vdb, sector 527 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> [ 57.795516] Buffer I/O error on dev dm-0, logical block 65, async page read
> [ 57.800743] device-mapper: dust: dust_add_block: badblock added at block 60 with write fail count 0
> [ 57.802587] device-mapper: dust: dust_add_block: badblock added at block 67 with write fail count 0
> [ 57.804359] device-mapper: dust: dust_add_block: badblock added at block 72 with write fail count 0
> [ 57.811253] device-mapper: dust: dust_add_block: badblock added at block 60 with write fail count 0
> [ 57.813065] device-mapper: dust: dust_add_block: badblock added at block 67 with write fail count 0
> [ 57.814786] device-mapper: dust: dust_add_block: badblock added at block 72 with write fail count 0
> [ 57.818023] device-mapper: dust: enabling read failures on bad sectors
> [ 57.826500] Buffer I/O error on dev dm-0, logical block 8, async page read

Hmm, there are many I/O errors for /dev/vdb and /dev/mapper/dust1. When the test
case succeeds, no I/O error is reported [2]. So the next question is: why do the
I/O errors happen?

My guess is that TEST_DEV might have a non-512 block size, probably 4096. The
test case dm/002 specifies 512 bytes as the block size for /dev/mapper/dust1, so
a 4096-byte block size on TEST_DEV will cause I/O errors due to the mismatched
block sizes. I ran dm/002 with a null_blk device with a 4096-byte block size and
observed very similar symptoms to what Christoph observed.
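In case it helps to reproduce this locally, a 4096-byte null_blk device can be set
up through configfs along the following lines. This is only a sketch: the device
name, size, and blktests config are illustrative and depend on the local setup.

  modprobe null_blk nr_devices=0
  mkdir /sys/kernel/config/nullb/nullb0
  echo 4096 > /sys/kernel/config/nullb/nullb0/blocksize    # 4096-byte logical blocks
  echo 1024 > /sys/kernel/config/nullb/nullb0/size         # device size in MB
  echo 1 > /sys/kernel/config/nullb/nullb0/memory_backed
  echo 1 > /sys/kernel/config/nullb/nullb0/power           # instantiate /dev/nullb0

  # point blktests at the new device and run the test case
  echo 'TEST_DEVS=(/dev/nullb0)' > config
  ./check dm/002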
Christoph, what is the output of "blockdev --getbsz /dev/vdb" on your system? If
it is not 512, could you try the blktests-side fix patch candidate [3]?

[2] Kernel messages in the success case:

[17024.868423] [ T160674] run blktests dm/002 at 2024-06-24 20:07:17
[17024.983252] [ T160697] device-mapper: dust: dust_add_block: badblock added at block 60 with write fail count 0
[17024.992010] [ T160698] device-mapper: dust: dust_add_block: badblock added at block 67 with write fail count 0
[17024.999706] [ T160699] device-mapper: dust: dust_add_block: badblock added at block 72 with write fail count 0
[17025.029553] [ T160704] device-mapper: dust: dust_add_block: badblock added at block 60 with write fail count 0
[17025.036658] [ T160705] device-mapper: dust: dust_add_block: badblock added at block 67 with write fail count 0
[17025.044143] [ T160706] device-mapper: dust: dust_add_block: badblock added at block 72 with write fail count 0
[17025.058206] [ T160708] device-mapper: dust: enabling read failures on bad sectors
[17025.067598] [ T160709] device-mapper: dust: block 60 removed from badblocklist by write
[17025.069738] [ T160709] device-mapper: dust: block 67 removed from badblocklist by write
[17025.071753] [ T160709] device-mapper: dust: block 72 removed from badblocklist by write

[3]

diff --git a/tests/dm/002 b/tests/dm/002
index 6635c43..f60e3bf 100755
--- a/tests/dm/002
+++ b/tests/dm/002
@@ -14,10 +14,12 @@ requires() {
 }
 
 test_device() {
+	local sz bsz
 	echo "Running ${TEST_NAME}"
 
-	TEST_DEV_SZ=$(blockdev --getsz "$TEST_DEV")
-	dmsetup create dust1 --table "0 $TEST_DEV_SZ dust $TEST_DEV 0 512"
+	sz=$(blockdev --getsz "$TEST_DEV")
+	bsz=$(blockdev --getbsz "$TEST_DEV")
+	dmsetup create dust1 --table "0 $sz dust $TEST_DEV 0 $bsz"
 	dmsetup message dust1 0 addbadblock 60
 	dmsetup message dust1 0 addbadblock 67
 	dmsetup message dust1 0 addbadblock 72
@@ -30,7 +32,7 @@ test_device() {
 	dmsetup message dust1 0 addbadblock 72
 	dmsetup message dust1 0 countbadblocks
 	dmsetup message dust1 0 enable
-	dd if=/dev/zero of=/dev/mapper/dust1 bs=512 count=128 oflag=direct >/dev/null 2>&1 || return $?
+	dd if=/dev/zero of=/dev/mapper/dust1 bs=$bsz count=128 oflag=direct >$FULL 2>&1 || return $?
 	sync
 	dmsetup message dust1 0 countbadblocks
 	sync
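For what it is worth, the effect of the change can also be checked by hand outside
blktests, roughly like this (a sketch only; /dev/vdb stands in for the actual test
device, which must not hold any data you care about):

  sz=$(blockdev --getsz /dev/vdb)      # device size in 512-byte sectors
  bsz=$(blockdev --getbsz /dev/vdb)    # block size reported by the kernel, e.g. 4096
  dmsetup create dust1 --table "0 $sz dust /dev/vdb 0 $bsz"
  dd if=/dev/zero of=/dev/mapper/dust1 bs=$bsz count=128 oflag=direct
  dmsetup remove dust1

With bs matching the device block size, the direct-I/O dd is expected to complete
without the "Invalid argument" error seen in 002.full.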