CC+: dm-devel

On Jun 13, 2024 / 23:10, hch@xxxxxxxxxxxxx wrote:
> On Fri, Jun 14, 2024 at 06:03:36AM +0000, Shinichiro Kawasaki wrote:
> > I would like to take a closer look in the failure, but I can not recreate it on
> > my test systems. I wonder what is the difference between your test system and
> > mine. I guess kernel config difference could be the difference. Could you share
> > your kernel config?
>
> Attached.

Thank you for the kernel config. I ran dm/002 with that kernel config on
several machines. One of the machines made dm/002 fail, but unfortunately its
failure symptom was different [1]. I will chase that as a separate failure.

As for the failure you observe, I have one guess about the cause. According to
the error log output, it looks like the dd command did not fail, even though
the preceding "dmsetup message dust1 0 enable" command enabled I/O errors on
the dm-dust device. I looked into the dm-dust code and found that the
fail_read_on_bb flag of struct dust_device is not guarded by any lock. So I
guess the fail_read_on_bb change made by the "dmsetup .. enable" command on
CPU1 was not visible to the dd running on CPU2. IOW, it looks like a memory
barrier issue to me.

Based on this guess, I think the change below may avoid the failure. Christoph,
may I ask you to check whether this change avoids the failure you observe?

diff --git a/drivers/md/dm-dust.c b/drivers/md/dm-dust.c
index 1a33820c9f46..da3ebdde287a 100644
--- a/drivers/md/dm-dust.c
+++ b/drivers/md/dm-dust.c
@@ -229,6 +229,7 @@ static int dust_map(struct dm_target *ti, struct bio *bio)
 	bio_set_dev(bio, dd->dev->bdev);
 	bio->bi_iter.bi_sector = dd->start + dm_target_offset(ti, bio->bi_iter.bi_sector);
 
+	smp_rmb();
 	if (bio_data_dir(bio) == READ)
 		r = dust_map_read(dd, bio->bi_iter.bi_sector, dd->fail_read_on_bb);
 	else
@@ -433,10 +434,12 @@ static int dust_message(struct dm_target *ti, unsigned int argc, char **argv,
 		} else if (!strcasecmp(argv[0], "disable")) {
 			DMINFO("disabling read failures on bad sectors");
 			dd->fail_read_on_bb = false;
+			smp_wmb();
 			r = 0;
 		} else if (!strcasecmp(argv[0], "enable")) {
 			DMINFO("enabling read failures on bad sectors");
 			dd->fail_read_on_bb = true;
+			smp_wmb();
 			r = 0;
 		} else if (!strcasecmp(argv[0], "countbadblocks")) {
 			spin_lock_irqsave(&dd->dust_lock, flags);


[1]

dm/002 => nvme0n1 (dm-dust general functionality test)       [failed]
    runtime  0.204s  ...  0.174s
    --- tests/dm/002.out	2024-06-14 14:37:40.480794693 +0900
    +++ /home/shin/Blktests/blktests/results/nvme0n1/dm/002.out.bad	2024-06-14 21:38:18.588976499 +0900
    @@ -7,4 +7,6 @@
     countbadblocks: 0 badblock(s) found
     countbadblocks: 3 badblock(s) found
     countbadblocks: 0 badblock(s) found
    +device-mapper: remove ioctl on dust1 failed: Device or resource busy
    +Command failed.
     Test complete
    modprobe: FATAL: Module dm_dust is in use.
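
For illustration only, the visibility concern about fail_read_on_bb can be
sketched in userspace with C11 atomics: the "enable"/"disable" side publishes
the flag with a release store, and the read path picks it up with an acquire
load. This is an analogy under stated assumptions, not the dm-dust code and
not the kernel barrier API. The names demo_dust_device, demo_dust_init,
demo_set_fail_read, demo_map_read and demo_enable_then_read are hypothetical,
as are the single bad_sector field and the -1 return value standing in for an
I/O error.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct dust_device (not the kernel type). */
struct demo_dust_device {
	atomic_bool fail_read_on_bb;	/* toggled by the "enable"/"disable" message */
	unsigned long long bad_sector;	/* one bad sector, for illustration only */
};

static void demo_dust_init(struct demo_dust_device *dd, unsigned long long bad)
{
	atomic_init(&dd->fail_read_on_bb, false);
	dd->bad_sector = bad;
}

/* Writer side: analogue of the "enable"/"disable" message handler. */
static void demo_set_fail_read(struct demo_dust_device *dd, bool enable)
{
	/* Release store: earlier writes by this thread are ordered before it. */
	atomic_store_explicit(&dd->fail_read_on_bb, enable, memory_order_release);
}

/* Reader side: analogue of the read path. Returns -1 for an injected error. */
static int demo_map_read(struct demo_dust_device *dd, unsigned long long sector)
{
	/* Acquire load: pairs with the release store in demo_set_fail_read(). */
	bool fail = atomic_load_explicit(&dd->fail_read_on_bb, memory_order_acquire);

	if (fail && sector == dd->bad_sector)
		return -1;	/* simulated I/O error on the bad sector */
	return 0;		/* read passes through */
}

/* Usage: mirrors the "enable, then read the bad sector" sequence. */
static int demo_enable_then_read(void)
{
	struct demo_dust_device dd;

	demo_dust_init(&dd, 60);
	demo_set_fail_read(&dd, true);
	return demo_map_read(&dd, 60);
}
```

The sketch only shows the paired store/load pattern on a single flag. It does
not reproduce the race itself, and whether the kernel-side problem is really
about ordering remains the guess stated above.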