Re: [PATCH 1/2] md bitmap bug fixes

Peter T. Breuer wrote:
Paul Clements <paul.clements@xxxxxxxxxxxx> wrote:

Peter T. Breuer wrote:

I don't see that this solves anything. If you had both sides going at
once, receiving different writes, then you are sc&**ed, and no
resolution of bitmaps will help you, since both sides have received
different (legitimate) data. It doesn't seem relevant to me to consider

You're forgetting that journalling filesystems and databases have to replay their journals or transaction logs when they start up.

All I'm saying is that in a split-brain scenario, typical cluster frameworks will make two (or more) systems active at the same time. This is not necessarily fatal because, as you pointed out, if only one of those systems (let's call it system A) is really available to the outside world, then you can usually simply trust the data on A and use it to sync over the other copies. But if system B brought up a database or journalling FS on its copy of the data, then the journal or log replay wrote to B's disk, and those blocks have to be overwritten from A as well. You can't simply use the bitmap on system A; you have to combine A's and B's bitmaps (or else do a full resync).


What about when A comes back up? We then get a

                .--------------.
       system A |    system B  |
         nbd ---'    [raid1]   |
         |           /     \   |
      [disk]     [disk]  [nbd]-'

situation, and a resync is done (skipping clean sectors).

You're forgetting that there may be some data (uncommitted data) that didn't reach B that is on A's disk (or even vice versa).


You are saying that the journal on A (presumably not raided itself?) is
waiting to play some data into its own disk as soon as we have finished
resyncing it from B? I don't think that would be a good idea at all.

No, I'm simply saying this: when you fail over from system A to system B (say you lost the network or system A died), there is some amount of data that could be out of sync (because raid1 submits writes to all disks simultaneously). When you take over using the data on system B, you're presumably going to want to (at some point) get A back to a state where it has the latest data (in case B later fails or in case A is a better system and you want to make it active, instead of B). To do that, you can't simply take the bitmap from B and sync back to A. You've got to look at the old bitmap on A and combine it with B's bitmap (or you've got to do a full resync). Until you've done that, the data that is on A is worthless.
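
To make the combining step concrete, here is a minimal sketch in Python. It is not the md driver's code and says nothing about the on-disk bitmap format; it assumes each bitmap is a plain byte string with one bit per chunk (bit 0 of byte 0 is chunk 0), and the 16-chunk "disks", the chunk numbers, and the helper names merge_bitmaps, dirty_chunks, and resync are all made up for illustration. It shows that resyncing A from B with only B's bitmap leaves the in-flight chunk on A stale, while the union of the two bitmaps does not.

```python
def merge_bitmaps(bitmap_a: bytes, bitmap_b: bytes) -> bytes:
    """Return the union (bitwise OR) of two dirty bitmaps of equal length."""
    if len(bitmap_a) != len(bitmap_b):
        raise ValueError("bitmaps must cover the same number of chunks")
    return bytes(a | b for a, b in zip(bitmap_a, bitmap_b))


def dirty_chunks(bitmap: bytes) -> list:
    """List chunk numbers whose bit is set (bit 0 of byte 0 is chunk 0)."""
    return [
        byte_index * 8 + bit
        for byte_index, byte in enumerate(bitmap)
        for bit in range(8)
        if byte & (1 << bit)
    ]


def resync(source: list, target: list, bitmap: bytes) -> None:
    """Copy only the chunks marked dirty in the bitmap from source to target."""
    for chunk in dirty_chunks(bitmap):
        target[chunk] = source[chunk]


# Hypothetical 16-chunk device, identical on both systems to begin with.
disk_a = ["old"] * 16
disk_b = ["old"] * 16

# A was active; a write to chunk 3 reached A's disk but not B's
# before the failover, so A's bitmap still has chunk 3 marked dirty.
disk_a[3] = "in-flight"
bitmap_a = bytes([0b00001000, 0b00000000])

# B takes over and writes chunks 5 and 9 while A is away.
disk_b[5] = "new"
disk_b[9] = "new"
bitmap_b = bytes([0b00100000, 0b00000010])

# Resync A from B using only B's bitmap: chunk 3 on A is never touched.
only_b = list(disk_a)
resync(disk_b, only_b, bitmap_b)

# Resync A from B using the union of both bitmaps.
merged = merge_bitmaps(bitmap_a, bitmap_b)
combined = list(disk_a)
resync(disk_b, combined, merged)
```

With B's bitmap alone, the copy of A still differs from B at chunk 3; with the merged bitmap the two end up identical, at the cost of copying one extra chunk rather than the whole device.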
