On Tue, 1 Nov 2022 11:22:21 -0600 Keith Busch <kbusch@xxxxxxxxxx> wrote:
> On Tue, Nov 01, 2022 at 12:15:58AM +0300, Dmitrii Tcvetkov wrote:
> >
> > # cat /proc/7906/stack
> > [<0>] submit_bio_wait+0xdb/0x140
> > [<0>] blkdev_direct_IO+0x62f/0x770
> > [<0>] blkdev_read_iter+0xc1/0x140
> > [<0>] vfs_read+0x34e/0x3c0
> > [<0>] __x64_sys_pread64+0x74/0xc0
> > [<0>] do_syscall_64+0x6a/0x90
> > [<0>] entry_SYSCALL_64_after_hwframe+0x4b/0xb5
> >
> > After "mdadm --fail" invocation the last line becomes:
> > [pid 7906] pread64(13, 0x627c34c8d200, 4096, 0) = -1 EIO (Input/output error)
>
> It looks like something isn't accounting for the IO size correctly
> when there's an offset. It may be something specific to one of the
> stacking drivers in your block setup. Does this still happen without
> the cryptsetup step?
>

I created an lvm(mdraid(gpt(HDD))) setup:

# lsblk -t -a
NAME                 ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE  RA WSAME
...
sdd                          0    512      0     512     512    1 bfq        64 128    0B
├─sdd3                       0    512      0     512     512    1 bfq        64 128    0B
│ └─md1                      0    512      0     512     512    1           128 128    0B
│   ├─512lvmraid-zfs         0    512      0     512     512    1           128 128    0B
│   └─512lvmraid-wrk         0    512      0     512     512    1           128 128    0B
sde                          0    512      0     512     512    1 bfq        64 128    0B
├─sde3                       0    512      0     512     512    1 bfq        64 128    0B
│ └─md1                      0    512      0     512     512    1           128 128    0B
│   ├─512lvmraid-zfs         0    512      0     512     512    1           128 128    0B
│   └─512lvmraid-wrk         0    512      0     512     512    1           128 128    0B

where:
# mdadm --create --level=1 --metadata=1.2 \
    --raid-devices=2 /dev/md1 /dev/sdd3 /dev/sde3
# pvcreate /dev/md1
# vgcreate 512lvmraid /dev/md1

In this case the problem doesn't reproduce; both guests start successfully.

It also doesn't reproduce with 4096-byte-sector loop devices:

# lsblk -t -a
NAME                 ALIGNMENT MIN-IO OPT-IO PHY-SEC LOG-SEC ROTA SCHED RQ-SIZE  RA WSAME
loop0                        0   4096      0    4096    4096    0 none      128 128    0B
└─md2                        0   4096      0    4096    4096    0           128 128    0B
  ├─4096lvmraid-zfs          0   4096      0    4096    4096    0           128 128    0B
  └─4096lvmraid-wrk          0   4096      0    4096    4096    0           128 128    0B
loop1                        0   4096      0    4096    4096    0 none      128 128    0B
└─md2                        0   4096      0    4096    4096    0           128 128    0B
  ├─4096lvmraid-zfs          0   4096      0    4096    4096    0           128 128    0B
  └─4096lvmraid-wrk          0   4096      0    4096    4096    0           128 128    0B

where:
# losetup --sector-size 4096 -f /dev/sdd4
# losetup --sector-size 4096 -f /dev/sde4
# mdadm --create --level=1 --metadata=1.2 \
    --raid-devices=2 /dev/md2 /dev/loop0 /dev/loop1
# pvcreate /dev/md2
# vgcreate 4096lvmraid /dev/md2

So indeed something seems to be wrong in the LUKS layer.

> For a different experiment, it may be safer to just force all
> alignment for stacking drivers. Could you try the following and see
> if that gets it working again?
>
> ---
> diff --git a/block/blk-settings.c b/block/blk-settings.c
> index 8bb9eef5310e..5c16fdb00c6f 100644
> --- a/block/blk-settings.c
> +++ b/block/blk-settings.c
> @@ -646,6 +646,7 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
>  		t->misaligned = 1;
>  		ret = -1;
>  	}
> +	blk_queue_dma_alignment(t, t->logical_block_size - 1);
>
>  	t->max_sectors = blk_round_down_sectors(t->max_sectors, t->logical_block_size);
>  	t->max_hw_sectors = blk_round_down_sectors(t->max_hw_sectors, t->logical_block_size);
> --

This doesn't compile:

  CC      block/blk-settings.o
block/blk-settings.c: In function ‘blk_stack_limits’:
block/blk-settings.c:649:33: error: passing argument 1 of ‘blk_queue_dma_alignment’ from incompatible pointer type [-Werror=incompatible-pointer-types]
  649 |         blk_queue_dma_alignment(t, t->logical_block_size - 1);
      |                                 ^
      |                                 |
      |                                 struct queue_limits *
In file included from block/blk-settings.c:9:
./include/linux/blkdev.h:956:37: note: expected ‘struct request_queue *’ but argument is of type ‘struct queue_limits *’
  956 | extern void blk_queue_dma_alignment(struct request_queue *, int);

I didn't find an obvious way to get the request_queue pointer that
corresponds to struct queue_limits *t.
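
One idea, though it's untested and only a sketch of what I mean:
blk_stack_limits() itself only sees the two queue_limits, but callers
such as disk_stack_limits() do hold the gendisk and therefore the
request_queue, so maybe the forcing could be done there after the
limits have been stacked. Something along these lines (the helper name
is made up by me):

static void force_stacked_dma_alignment(struct gendisk *disk)
{
	struct request_queue *q = disk->queue;

	/*
	 * dma_alignment lives on the request_queue in this tree, so derive
	 * it from the logical block size that blk_stack_limits() already
	 * stacked into q->limits.
	 */
	blk_queue_dma_alignment(q, q->limits.logical_block_size - 1);
}

i.e. call it from disk_stack_limits() right after its
blk_stack_limits() call. That would of course miss stacking paths that
call blk_stack_limits() directly without going through
disk_stack_limits(), so it may not be what you had in mind.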