On Thu, 2024-04-25 at 19:35 -0400, Benjamin Marzinski wrote:
>
> 1. create a multipath device with a kpartx partition on top of it and
>    no_path_retry set to either "queue" or something long enough to run
>    all the commands in the reproducer before it disables queueing.
> 2. disable all the paths to the device with something like:
>    # echo offline > /sys/block/<path_dev>/device/state
> 3. Write directly to the multipath device with something like:
>    # dd if=/dev/zero of=/dev/mapper/<mpath_dev> bs=4K count=1
> 4. delete all the paths to the device with something like:
>    # echo 1 > /sys/block/<path_dev>/device/delete

I've tried to reproduce the issue with these commands. The test system
was using a LIO iSCSI target with 2 paths. I created a test script
(attached) that runs the offline / IO / delete procedure repeatedly.

I haven't been able to make multipathd hang even once. I also played
around with dd options. If I use oflag=sync or oflag=direct, the dd
command itself hangs.

Did I set something up incorrectly, or does the behavior depend on the
kernel or on something else? Mine was a 6.4 kernel.

This is not to say there's something wrong with your patch, but I'd
like to understand the error situation better, as I can't trigger it
on my test system.

multipath.conf:

    defaults {
        verbosity 3
        flush_on_last_del yes
    }
    blacklist {
        wwid QEMU
    }
    overrides {
        no_path_retry queue
    }

Regards,
Martin

Attachment:
flush-0-paths.sh
Description: application/shellscript
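
For readers without the attachment, one pass of the offline / IO / delete
procedure quoted above could be sketched roughly as follows. This is not the
attached flush-0-paths.sh, whose contents are not reproduced here. The names
MPATH_DEV, PATH_DEVS, DRY_RUN, run and reproduce_once are hypothetical, the
device names are placeholders, and restoring the paths between iterations
(for example by rescanning the iSCSI sessions) is deliberately left out. By
default the sketch only prints the commands it would execute.

```shell
#!/bin/bash
# Hypothetical sketch of one offline / IO / delete pass (not the attached script).
# MPATH_DEV and PATH_DEVS are placeholders; adjust them to the test system.
MPATH_DEV=${MPATH_DEV:-mpatha}
PATH_DEVS=${PATH_DEVS:-"sdb sdc"}
# DRY_RUN=1 (the default) only prints the commands; DRY_RUN=0 runs them (needs root).
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "+ $*"
    else
        sh -c "$*"
    fi
}

reproduce_once() {
    local dev
    # Step 2: set every path device offline
    for dev in $PATH_DEVS; do
        run "echo offline > /sys/block/$dev/device/state"
    done
    # Step 3: buffered write to the multipath device
    run "dd if=/dev/zero of=/dev/mapper/$MPATH_DEV bs=4K count=1"
    # Step 4: delete every path device
    for dev in $PATH_DEVS; do
        run "echo 1 > /sys/block/$dev/device/delete"
    done
}

reproduce_once
```

Running it with DRY_RUN=0 executes the commands for real. Step 1 (the
multipath device with a kpartx partition and no_path_retry queue) is assumed
to be set up beforehand.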