The current dm-multipath behaviour is a potent data corrupter on
PR-based (persistent reservation) clusters that share multipaths with
the queue_if_no_path feature enabled.

Consider the following scenario:

- Node A takes a write-exclusive persistent reservation on LU.
- Node B submits a write io (wio) to LU, which it sees as an sda-sdb
  multipath.
- B's dm_multipath routes the wio to sda; the wio fails and the path is
  marked failed.
- B's dm_multipath routes the wio to sdb; the wio fails and the last
  path is marked failed.
- B queues the wio because of the queue_if_no_path feature. The process
  that submitted the wio is stuck in D-state.
- A releases the reservation. The queued wios are unqueued, corrupting
  the data on LU.

I suspect a wio that returns a "reservation conflict" status should
never be queued. A DM suspend/resume on the multipath effectively
flushes the queue, but that workaround leaves a window open for data
corruption between the io being enqueued and the user-space-driven
queue flush.

I saw Mike's Aug 2005 patches for translating SCSI errors into
block-layer errors, which were a usable infrastructure for implementing
the desired behaviour. Is some variant of this work headed for the
upstream kernel?

Regards,
cvaroqui

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
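
For illustration only, the scenario above can be modelled with a small user-space simulation. This is a toy sketch, not kernel code: the names LogicalUnit, Multipath, fail_fast_on_conflict, reinstate_and_flush and run_scenario are all hypothetical, and the model assumes that the failed paths are reinstated (for example by a path checker) once the reservation is released. With fail_fast_on_conflict off, it mimics the behaviour described in the scenario; with it on, it mimics the suggested behaviour of never queueing a wio that came back with a reservation conflict.

```python
from collections import deque
from enum import Enum, auto


class Status(Enum):
    OK = auto()
    RESERVATION_CONFLICT = auto()  # target-level refusal, same answer on every path


class LogicalUnit:
    """Toy LU with a write-exclusive reservation held by at most one node."""

    def __init__(self):
        self.holder = None  # node currently holding the reservation, if any
        self.data = None    # last payload successfully written

    def write(self, node, payload):
        if self.holder is not None and self.holder != node:
            return Status.RESERVATION_CONFLICT
        self.data = payload
        return Status.OK


class Multipath:
    """Toy multipath map for one node, with queue_if_no_path semantics."""

    def __init__(self, node, lu, paths, fail_fast_on_conflict):
        self.node = node
        self.lu = lu
        self.valid_paths = list(paths)
        # Hypothetical switch: True models the behaviour suggested above.
        self.fail_fast_on_conflict = fail_fast_on_conflict
        self.queue = deque()

    def submit(self, payload):
        while self.valid_paths:
            status = self.lu.write(self.node, payload)
            if status is Status.OK:
                return "completed"
            if self.fail_fast_on_conflict and status is Status.RESERVATION_CONFLICT:
                # Hand the error back to the submitter; the path stays valid.
                return "failed"
            # Behaviour from the scenario: any error marks the path failed.
            self.valid_paths.pop(0)
        # No valid path left: queue_if_no_path holds the io instead of failing it.
        self.queue.append(payload)
        return "queued"

    def reinstate_and_flush(self, paths):
        # Assumption: paths come back once the LU accepts io again,
        # and everything that was queued is then resubmitted.
        self.valid_paths = list(paths)
        while self.queue:
            self.lu.write(self.node, self.queue.popleft())


def run_scenario(fail_fast_on_conflict):
    lu = LogicalUnit()
    lu.holder = "A"                      # A takes the reservation
    lu.write("A", "A's data")
    mp_b = Multipath("B", lu, ["sda", "sdb"], fail_fast_on_conflict)
    outcome = mp_b.submit("B's stale write")
    lu.holder = None                     # A releases the reservation
    mp_b.reinstate_and_flush(["sda", "sdb"])
    return outcome, lu.data
```

In this model, run_scenario(False) leaves B's stale write on the LU after A releases the reservation, while run_scenario(True) returns the error to B's submitter and leaves A's data untouched.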