On 12.08.2018 at 14:14, Danil Kipnis wrote:
> Fio (or some other application like key-value or object database)
> submits two writes which go to the same offset in a file (or block
> device). Since fio is using libaio, _both_ those writes reach the md
> layer. Md forwards those writes to each of its legs and waits for
> confirmations to return. On one leg/disk the writes are executed in
> one order and on another leg - the other way round. The order in which
> the writes are executed is decided by e.g. some firmware inside each
> of the two hdds, md has no possibility to enforce the same order on
> each leg. And now you have one value on one leg and another on
> another. Md receives both confirmations of both writes and tells the
> user everything is fine. And the user will read only one of those
> values all the time, at least for md-raid, where read order is static,
> until of course you remove one leg, which contained this value, and
> suddenly the user reads the other one.
>
> To quote Wikipedia on CAP theorem, this thing „consistency: Every read
> receives the most recent write or an error“, can not be guaranteed by
> the raid1.
>
> So the application must enforce it - like ext4 or any journaling file
> system is doing for its meta data. Which means in the most primitive
> way: do not submit two writes at the same time, wait for the first one
> to return, then submit another one

I see no logic here, because I expect a mirror such as RAID1/RAID10 to
hold identical data on both mirrors, without any but/if/or/maybe.

"Two threads writing with O_DIRECT io to the same address could result
in different data on the two devices" makes no sense. Everything talks
to the RAID1 layer, which is a block device and is expected to always
have the same data on both mirrors. O_DIRECT doesn't bypass the RAID
layer, because it doesn't even know about the physical disks underneath.

If any workload (except a hard crash) leads to different data, it's a
bug which should be fixed sooner rather than later.
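
To make the scenario in the quoted text concrete, here is a toy model in
Python. It is not md code and makes no claim about what md actually does
internally; it only assumes, as the quoted text does, that each leg may
apply two in-flight writes to the same offset in its own order. All
function names are made up for this illustration.

```python
from itertools import permutations


def apply_writes(order):
    """Final content of one block on one leg after applying writes in `order`."""
    block = None
    for value in order:
        block = value  # a later write to the same offset overwrites the earlier one
    return block


def concurrent_outcomes(writes):
    """All (leg0, leg1) end states if each leg orders the in-flight writes independently.

    Assumption taken from the quoted scenario: nothing forces both legs
    to use the same order.
    """
    return {
        (apply_writes(order0), apply_writes(order1))
        for order0 in permutations(writes)
        for order1 in permutations(writes)
    }


def serialized_outcomes(writes):
    """End state if the submitter waits for each write to complete on both legs."""
    leg0 = leg1 = None
    for value in writes:
        # Only one write is in flight, so neither leg has anything to reorder.
        leg0 = apply_writes([value])
        leg1 = apply_writes([value])
    return {(leg0, leg1)}


writes = ["A", "B"]
diverged = {pair for pair in concurrent_outcomes(writes) if pair[0] != pair[1]}
print("concurrent, legs differ:", sorted(diverged))
print("serialized:", sorted(serialized_outcomes(writes)))
```

Under that assumption, two of the four concurrent outcomes leave the
legs with different data, while the serialized variant always ends with
both legs identical. Whether the RAID1 layer itself should rule out the
diverging cases is exactly the point in dispute above; the model does
not settle it.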