Hi,

Zdenek Kabelac wrote:
> So as guessed earlier - unrelated to lvm2.

Yes. And thank you for pointing me to fsfreeze.

> You likely need to discover what is wrong with your 'raid' device ?
> Was your raid array fully synchronized ?
>
> Do you have only problem with one particular MD 'raid' on your system - or
> any other 'raid' you attach/create will suffer the same problem ?
>
> Is it 'nvme' related on your system ?
>
> Are the 'individual' nvme devices running fine - just when they are mixed
> together into a single array you get these 'fsfreeze' troubles ?

I think it is unrelated to the mdraid, because I was able to reproduce the
problem with a single nvme device which I removed from the raid array.

However, I tried different kernel versions:

- Brand new 5.19 shows the same problem
- 5.16 was the last working kernel

So I ran a bisect, which revealed

> md: add support for REQ_NOWAIT
>
> commit 021a24460dc2 ("block: add QUEUE_FLAG_NOWAIT") added support
> for checking whether a given bdev supports handling of REQ_NOWAIT or not.
> Since then commit 6abc49468eea ("dm: add support for REQ_NOWAIT and enable
> it for linear target") added support for REQ_NOWAIT for dm. This uses
> a similar approach to incorporate REQ_NOWAIT for md based bios.
>
> This patch was tested using t/io_uring tool within FIO. A nvme drive
> was partitioned into 2 partitions and a simple raid 0 configuration
> /dev/md0 was created.
>
> md0 : active raid0 nvme4n1p1[1] nvme4n1p2[0]
>       937423872 blocks super 1.2 512k chunks
>
> Before patch:
>
> $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100
>
> Running top while the above runs:
>
> $ ps -eL | grep $(pidof io_uring)
>
> 38396 38396 pts/2    00:00:00 io_uring
> 38396 38397 pts/2    00:00:15 io_uring
> 38396 38398 pts/2    00:00:13 iou-wrk-38397
>
> We can see iou-wrk-38397 io worker thread created which gets created
> when io_uring sees that the underlying device (/dev/md0 in this case)
> doesn't support nowait.
>
> After patch:
>
> $ ./t/io_uring /dev/md0 -p 0 -a 0 -d 1 -r 100
>
> Running top while the above runs:
>
> $ ps -eL | grep $(pidof io_uring)
>
> 38341 38341 pts/2    00:10:22 io_uring
> 38341 38342 pts/2    00:10:37 io_uring
>
> After running this patch, we don't see any io worker thread
> being created which indicated that io_uring saw that the
> underlying device does support nowait. This is the exact behaviour
> noticed on a dm device which also supports nowait.
>
> For all the other raid personalities except raid0, we would need
> to train pieces which involves make_request fn in order for them
> to correctly handle REQ_NOWAIT.

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=f51d46d0e7cb5b8494aa534d276a9d8915a2443d

as the bad commit. Building the latest kernel with this commit reverted (and
the follow-up fix 0f9650bd838efe5c52f7e5f40c3204ad59f1964d reverted, too)
fixes the problem for me.

What I do _not_ understand yet: it's a change in the md driver -- how could
that change affect the single device I pulled off the array? However, during
the bisect I only tested against the mdraid array. Maybe there is another
"nowait" issue, or a specific problem with "REQ_NOWAIT" and the Dell OEM
nvme devices...

I'll post to LKML shortly, thanks!

--
Regards,
Thomas

_______________________________________________
linux-lvm mailing list
linux-lvm@xxxxxxxxxx
https://listman.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/
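
P.S. In case anyone wants to retrace this, here is a rough sketch of the
bisect and revert steps. The build/install commands are generic placeholders,
not my exact setup -- adjust to your own config, bootloader and test workload:

  $ git bisect start
  $ git bisect bad v5.19
  $ git bisect good v5.16
    # build and boot each candidate kernel, run the fsfreeze test, then mark it:
  $ git bisect good        # or: git bisect bad
    # repeat until git reports the first bad commit, then:
  $ git bisect reset

    # verify against the current tree with the two commits reverted
    # (follow-up fix first, then the original commit):
  $ git revert 0f9650bd838efe5c52f7e5f40c3204ad59f1964d
  $ git revert f51d46d0e7cb5b8494aa534d276a9d8915a2443d
  $ make olddefconfig && make -j$(nproc)
  $ make modules_install install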
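
P.P.S. To narrow down the single-device puzzle, the t/io_uring check from the
quoted commit message can also be pointed at the bare member device. The device
name below is only an example -- substitute the nvme device pulled from the
array:

  $ ./t/io_uring /dev/nvme4n1 -p 0 -a 0 -d 1 -r 100

    # while it runs, in another terminal:
  $ ps -eL | grep $(pidof io_uring)

    # iou-wrk-* threads present: io_uring falls back to worker threads,
    #   i.e. the device is not treated as nowait-capable
    # no iou-wrk-* threads: the nowait path is taken on the bare nvme
    #   device as well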