Hi John,

Thank you for your quick response. I'll try to elaborate further.

What we are trying to understand is whether there is a potential race
between reads and writes when mirroring two devices. This is unrelated
to the fact that the write was not acked.

The scenario is as follows. Assume we have a reader R, a writer W, and
two MD devices A and B. A and B are managed under a device M, which is
configured to use A and B as mirrors (RAID 1). Currently A and B hold
the same data; let's call it V1.

1. W issues a write (V2) to the managed device M.
2. The driver sends the write to both A and B at the same time.
3. The write to device A (V2) completes.
4. R issues a read to M, which directs it to A and returns the result
   (V2).
5. The driver and device A fail at the same time, before the write
   ever gets to device B.
6. When the driver recovers, all it is left with is device B, so
   future reads will return older data (V1) than the data that was
   already returned to R.

Thanks,
Asaf

On Fri, Mar 17, 2023 at 10:58 PM John Stoffel <john@xxxxxxxxxxx> wrote:
>
> >>>>> "Ronnie" == Ronnie Lazar <ronnie.lazar@xxxxxxxxxxxx> writes:
>
> > I'm trying to understand how mdadm protects against inconsistent data
> > read in the face of failures that occur while writing to a device that
> > has raid1.
>
> You need to give a better test case, with examples.
>
> > Here is the scenario: I have set up raid1 that has 2 mirrors. First
> > one is on local storage and the second is on remote storage. The
> > remote storage mirror is configured with write-mostly.
>
> Configuration details? And what is the remote device?
>
> > We have parallel jobs: 1 writing to an area on the device and the
> > other reading from that area.
>
> So you create /dev/md9 and are writing/reading from it, then the
> system crashes and you lose the local half of the mirror, right?
>
> > The write operation writes the data to the first mirror, and at that
> > point the read operation reads the new data from the first mirror.
>
> So how is your write succeeding if it's not written to both halves of
> the MD device? You need to give more details and maybe even some
> example code showing what you're doing here.
>
> > Now, before data has been written to the second (remote) mirror a
> > failure has occurred which caused the first machine to fail. When
> > the machine comes up, the data is recovered from the second, remote,
> > mirror.
>
> Ah... some more details. It sounds like you have a system A which is
> writing to a SITE local remote device as well as a REMOTE site device
> in the MD mirror, is this correct?
>
> Are these iSCSI devices? FibreChannel? NBD devices? More details
> please.
>
> > Now when reading from this area, the users will receive the older
> > value, even though, in the first read they got the newer value that
> > was written.
>
> > Does mdadm protect against this inconsistency?
>
> It shouldn't be returning success on the write until both sides of the
> mirror are updated. But we can't really tell until you give more
> details and an example.
>
> I assume you're not building a RAID1 device and then writing to the
> individual devices behind its back or something silly like that,
> right?
>
> John
>
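
P.S. Since example code was requested, here is a minimal sketch of the
ordering described at the top of this message. It is only a toy model
in Python, not the md driver and not a reproducer: the class ToyMirror
and its methods are hypothetical names made up for illustration, and
the model simply assumes that a read may be served from a leg whose
write has completed while the write to the other leg is still in
flight. Whether md actually allows that is exactly the question.

```python
class ToyMirror:
    """Hypothetical two-leg mirror model; it does NOT reflect md internals."""

    def __init__(self, initial):
        # Both legs start out holding the same data (V1 in the scenario).
        self.legs = {"A": initial, "B": initial}
        self.failed = set()

    def write_leg(self, leg, value):
        # Models the write reaching one individual leg.
        if leg not in self.failed:
            self.legs[leg] = value

    def fail(self, leg):
        # Models the permanent loss of one leg.
        self.failed.add(leg)

    def read(self, prefer="A"):
        # Serve the read from the preferred leg if it is still alive,
        # otherwise from any surviving leg.
        for leg in [prefer] + sorted(self.legs):
            if leg not in self.failed:
                return self.legs[leg]
        raise IOError("no surviving legs")


def run_scenario():
    m = ToyMirror("V1")
    m.write_leg("A", "V2")        # write to A completes; write to B in flight
    first = m.read(prefer="A")    # R reads from A and sees V2
    m.fail("A")                   # driver and A fail before B is written
    second = m.read(prefer="A")   # after recovery only B is left
    return first, second


first_read, second_read = run_scenario()
print(first_read, second_read)    # prints "V2 V1"
```

In this model the second read returns older data than the first one,
which is the inconsistency we are asking about.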