Christophe Varoqui wrote:
The current dm-multipath behaviour is a potent data corrupter on Persistent Reservation-based clusters sharing multipaths with the queue_if_no_path feature on (Clariion, Storageworks, ...).

Consider the following scenario:

- Node A takes a write-exclusive persistent reservation on LU
- Node B submits a write io (wio) to LU, which is a sda-sdb multipath
- B dm_multipath routes the wio to sda, the wio is failed, the path is marked failed
- B dm_multipath routes the wio to sdb, the wio is failed, the last path is marked failed
- B queues the wio because of the queue_if_no_path feature. The process submitting the wio is stuck in D-state.
- A releases the reservation. Queued wios are unqueued, corrupting the data on LU.

I suspect a wio returning a "reservation conflict" status should never be queued.

DM suspend/resume on the multipath devmap effectively flushes the queue, but this solution leaves a window open for data corruption, between io enqueue and the user-space driven queue flush.

Is there work in progress to address this issue yet? What would be an acceptable solution design? (For example, Mike Christie suggested in Aug 2005 a scsi-to-blk error translation patch, which got nowhere.)
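The scenario above can be sketched as a toy model. This is only an illustration in Python, not dm-multipath code: the classes `LogicalUnit` and `Multipath`, the `fail_fast_on_conflict` switch and the `reinstate_paths` step (standing in for whatever brings the paths back once A releases the reservation) are all hypothetical names made up for the sketch. It assumes that every error, including a reservation conflict, fails the path in the current behaviour, and that the suggested behaviour returns the conflict to the submitter without queueing.

```python
from collections import deque

OK = "ok"
RESERVATION_CONFLICT = "reservation_conflict"
QUEUED = "queued"
ERROR = "error"


class LogicalUnit:
    """Toy LU with a single write-exclusive reservation holder (hypothetical model)."""

    def __init__(self):
        self.holder = None
        self.data = None

    def reserve(self, node):
        self.holder = node

    def release(self):
        self.holder = None

    def write(self, node, payload):
        # A write from any node other than the reservation holder is rejected.
        if self.holder is not None and self.holder != node:
            return RESERVATION_CONFLICT
        self.data = payload
        return OK


class Multipath:
    """Toy multipath map for one node (hypothetical, not the kernel's logic)."""

    def __init__(self, node, lu, paths, queue_if_no_path=True,
                 fail_fast_on_conflict=False):
        self.node = node
        self.lu = lu
        self.active = list(paths)
        self.queue_if_no_path = queue_if_no_path
        self.fail_fast_on_conflict = fail_fast_on_conflict
        self.queue = deque()

    def submit(self, payload):
        for path in list(self.active):
            status = self.lu.write(self.node, payload)
            if status == OK:
                return OK
            if status == RESERVATION_CONFLICT and self.fail_fast_on_conflict:
                # Suggested behaviour: report the conflict, never queue it.
                return status
            # Assumed current behaviour: any error marks the path failed.
            self.active.remove(path)
        if self.queue_if_no_path:
            self.queue.append(payload)
            return QUEUED
        return ERROR

    def reinstate_paths(self, paths):
        # Paths come back; everything that was queued is resubmitted.
        self.active = list(paths)
        pending = list(self.queue)
        self.queue.clear()
        for payload in pending:
            self.submit(payload)


def run_scenario(fail_fast_on_conflict):
    lu = LogicalUnit()
    lu.reserve("A")                      # A takes the reservation
    mp = Multipath("B", lu, ["sda", "sdb"],
                   fail_fast_on_conflict=fail_fast_on_conflict)
    status = mp.submit("B-stale-data")   # B writes while A holds the LU
    lu.write("A", "A-data")              # A does its own write
    lu.release()                         # A releases the reservation
    mp.reinstate_paths(["sda", "sdb"])   # queued wios are unqueued
    return status, lu.data


if __name__ == "__main__":
    print(run_scenario(False))  # B's queued write lands after A's
    print(run_scenario(True))   # B sees the conflict, A's data survives
```

In this model the only difference between the two runs is whether a reservation conflict is treated as a path failure; the window between enqueue and a user-space flush does not exist in the second run because nothing is ever enqueued.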
If memory serves, a SCSI command status of RESERVATION CONFLICT did not find its way back to the sg driver API (and/or the command was retried). Is that still the case?

Doug Gilbert
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html