persistent reservation behaviour with dm-multipath

dm-multipath's current behaviour is a potent data corrupter on PR-based
clusters that share multipath devices with the queue_if_no_path feature
enabled. Consider the following scenario:

- Node A takes a write-exclusive persistent reservation on LU
- Node B submits a write io (wio) to LU, which B sees as a multipath
over sda and sdb
- B's dm-multipath routes the wio to sda; the wio fails and the path is
marked failed
- B's dm-multipath routes the wio to sdb; the wio fails and the last
path is marked failed
- B queues the wio because of the queue_if_no_path feature. The process
submitting the wio is stuck in D-state.
- A releases the reservation. B's queued wios are then unqueued and
written, corrupting the data on LU.
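
The sequence above can be sketched as a toy Python model. This is not
kernel code: the LogicalUnit and Multipath classes, the reinstate_paths
step (standing in for whatever brings the paths back before the queue is
replayed) and the run_scenario helper are all hypothetical names made up
for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional

OK = "ok"
RESERVATION_CONFLICT = "reservation conflict"


@dataclass
class LogicalUnit:
    """Toy LU: remembers who holds the reservation and what was written."""
    reserved_by: Optional[str] = None
    data: list = field(default_factory=list)

    def write(self, node, payload):
        # A write from any node other than the reservation holder is refused.
        if self.reserved_by is not None and self.reserved_by != node:
            return RESERVATION_CONFLICT
        self.data.append((node, payload))
        return OK


@dataclass
class Multipath:
    """Toy multipath map that treats every failed write as a path failure."""
    node: str
    lu: LogicalUnit
    paths: dict = field(default_factory=lambda: {"sda": True, "sdb": True})
    queue_if_no_path: bool = True
    queued: list = field(default_factory=list)

    def submit(self, payload):
        for name in list(self.paths):
            if not self.paths[name]:
                continue
            if self.lu.write(self.node, payload) == OK:
                return "completed"
            # The failure reason is ignored: the path is simply marked failed.
            self.paths[name] = False
        if self.queue_if_no_path:
            self.queued.append(payload)
            return "queued"
        return "failed"

    def reinstate_paths(self):
        # Assumption: paths come back and the queued ios are resubmitted.
        for name in self.paths:
            self.paths[name] = True
        pending, self.queued = self.queued, []
        return [self.submit(p) for p in pending]


def run_scenario():
    lu = LogicalUnit()
    lu.reserved_by = "A"                  # A takes the reservation
    b = Multipath(node="B", lu=lu)
    first = b.submit("stale write")       # both paths fail, the wio is queued
    lu.reserved_by = None                 # A releases the reservation
    replayed = b.reinstate_paths()        # the queued wio reaches the LU
    return first, replayed, lu.data
```

In this model the write B issued while A held the reservation ends up
in the LU's data once A releases it, which is the corruption described
above.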

I suspect a wio that returns a "reservation conflict" status should
never be queued.
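
A minimal sketch of that policy in Python, assuming the multipath layer
could see why an io failed. The Status enum and the dispatch function
are hypothetical names for illustration, not an existing dm-multipath
interface: a reservation conflict fails the io straight back to the
submitter and leaves the paths alone, while only transport errors fail
a path and can end in queueing.

```python
from enum import Enum


class Status(Enum):
    GOOD = "good"
    RESERVATION_CONFLICT = "reservation conflict"
    TRANSPORT_ERROR = "transport error"


def dispatch(path_statuses, queue_if_no_path=True):
    """Decide the fate of one write io.

    path_statuses lists, in order, the status each path would return
    for this io (a stand-in for actually sending it down the path).
    """
    for status in path_statuses:
        if status is Status.GOOD:
            return "completed"
        if status is Status.RESERVATION_CONFLICT:
            # The device answered, so the path is healthy: fail the io
            # to the submitter, never retry on another path, never queue.
            return "failed"
        # Transport error: treat the path as failed and try the next one.
    # Every path hit a transport error: queue or fail as configured.
    return "queued" if queue_if_no_path else "failed"
```

With this rule, dispatch([Status.RESERVATION_CONFLICT,
Status.RESERVATION_CONFLICT]) returns "failed" even with
queue_if_no_path on, so nothing is left in the queue to be replayed
after the reservation is released.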

A DM suspend/resume on the multipath effectively flushes the queue, but
this workaround leaves a window open for data corruption between the io
being queued and the user-space-driven queue flush.

I saw Mike's Aug 2005 patches for translating SCSI errors into
block-layer errors, which would be a usable infrastructure for
implementing the desired behaviour. Is some variant of this work headed
for the upstream kernel?

Regards,
cvaroqui

--
dm-devel mailing list
dm-devel@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/dm-devel
