persistent reservation behaviour with dm-multipath

The current dm-multipath behaviour is a potent data corrupter
on Persistent Reservation-based clusters sharing multipaths with the
queue_if_no_path feature on (Clariion, Storageworks, ...).

Consider the following scenario:

- Node A takes a write-exclusive persistent reservation on LU
- Node B submits a write io (wio) to LU, which is a sda-sdb multipath
- B's dm-multipath routes the wio to sda, the wio is failed, the path is
marked failed
- B's dm-multipath routes the wio to sdb, the wio is failed, the last
path is marked failed
- B queues the wio because of the queue_if_no_path feature. The process
submitting the wio is stuck in D-state.
- A releases the reservation. Queued wios are unqueued, corrupting the
data on LU.
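
The sequence above can be replayed with a toy model. This is only a
minimal sketch in Python, not kernel code: the classes LogicalUnit and
Multipath, the status strings and the reinstate_paths() step (standing in
for whatever brings the paths back after the release) are hypothetical
names made up for illustration.

```python
# Toy model of the scenario. All names are hypothetical, not kernel APIs.

class LogicalUnit:
    def __init__(self):
        self.data = "written by A"
        self.reserved_by = None  # holder of the write-exclusive reservation

    def write(self, node, payload):
        # A write from a non-holder is rejected while the reservation is held.
        if self.reserved_by is not None and self.reserved_by != node:
            return "RESERVATION_CONFLICT"
        self.data = payload
        return "GOOD"


class Multipath:
    def __init__(self, node, lu, paths, queue_if_no_path=True):
        self.node = node
        self.lu = lu
        self.active = list(paths)
        self.failed = []
        self.queue_if_no_path = queue_if_no_path
        self.queued = []

    def submit(self, payload):
        # Try each active path in turn; any error marks that path failed.
        while self.active:
            if self.lu.write(self.node, payload) == "GOOD":
                return "GOOD"
            self.failed.append(self.active.pop(0))
        # No path left: queue the write or fail it, depending on the feature.
        if self.queue_if_no_path:
            self.queued.append(payload)
            return "QUEUED"
        return "EIO"

    def reinstate_paths(self):
        # Assumed step: paths come back and the queued writes are replayed.
        self.active, self.failed = self.failed, []
        pending, self.queued = self.queued, []
        return [self.submit(p) for p in pending]


lu = LogicalUnit()
lu.reserved_by = "A"                         # A takes the reservation
mp_b = Multipath("B", lu, ["sda", "sdb"])
status = mp_b.submit("stale write from B")   # both paths fail, wio is queued
lu.reserved_by = None                        # A releases the reservation
replayed = mp_b.reinstate_paths()            # the queued wio now reaches LU
```

In this model the write that A's reservation was meant to block ends up
on LU as soon as the reservation is released.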

I suspect that a wio returning a "reservation conflict" status should
never be queued.
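
As a rough illustration of that rule, here is a minimal sketch in Python
of the decision taken once the last path has failed. The function name
should_queue and the status strings are hypothetical and do not
correspond to existing dm-multipath code.

```python
# Hypothetical status names, not actual kernel constants.
RESERVATION_CONFLICT = "RESERVATION_CONFLICT"
PATH_ERROR = "PATH_ERROR"


def should_queue(last_status, queue_if_no_path):
    """Decide whether a failed wio may be queued when no path is left."""
    if last_status == RESERVATION_CONFLICT:
        # The target rejected the write on purpose: replaying it later
        # could overwrite data once the reservation is released.
        return False
    # Any other failure keeps the current queue_if_no_path behaviour.
    return queue_if_no_path
```

With such a check, the wio from node B in the scenario would be returned
to the submitter with an error instead of waiting in the queue.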

DM suspend/resume on the multipath devmap effectively flushes the queue,
but this solution leaves a window open for data corruption, between io
enqueue and user-space driven queue flush.
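
For reference, the user-space flush described above could be driven by
something like the following Python sketch. It only wraps the dmsetup
suspend and dmsetup resume commands. The function name, the injectable
run parameter and the map name "mpatha" are assumptions made for
illustration, and the call needs root privileges.

```python
import subprocess


def flush_multipath_queue(map_name, run=subprocess.run):
    """Suspend then resume a multipath devmap (map_name is hypothetical)."""
    # The corruption window stays open until the suspend below is issued.
    run(["dmsetup", "suspend", map_name], check=True)
    run(["dmsetup", "resume", map_name], check=True)


# Example call, as root: flush_multipath_queue("mpatha")
```

Nothing here closes the window between the enqueue and the moment this
function is called, which is the weakness of the workaround.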

Is there work in progress to address this issue yet? What would be an
acceptable solution design? (For example, Mike Christie suggested a
scsi-to-blk error translation patch in Aug 2005, which got nowhere.)

Regards,
cvaroqui
