On Tue, Mar 02, 2010 at 04:01:00PM +1100, Neil Brown wrote:
>On Sun, 28 Feb 2010 09:09:49 +0100
>Luca Berra <bluca@xxxxxxxxxx> wrote:
>
>> On Thu, Feb 25, 2010 at 08:39:36AM +1100, Neil Brown wrote:
>> >On Wed, 24 Feb 2010 11:12:09 -0500
>> >"Martin K. Petersen" <martin.petersen@xxxxxxxxxx> wrote:
>> >
>> >> So realistically both disk blocks are wrong and there's a window until
>> >> the new, correct block is written. That window will only cause problems
>> >> if there is a crash and we'll need to recover. My main concern here is
>> >> how big the discrepancy between the disks can get, and whether we'll end
>> >> up corrupting the filesystem during recovery because we could
>> >> potentially be matching metadata from one disk with journal entries from
>> >> another.
>> >
>> >After a crash, md will only read from one of the devices (the first) until a
>> >resync has completed. So there should be no room for more confusion than you
>> >would expect on a single device.
>>
>> After thinking more about this I could come up with another concern
>> about write ordering.
>> example:
>> app writes block A, B, C
>> md writes A on both disks
>> md writes B on disk1
>> app writes B again (B')
>> md writes B' on disk2
>> now md would write B' again on both disks, but the system crashes
>> (note, C is never written due to crash)
>>
>> Disk 1 contains A and B in the correct order, it is missing C and B' but we
>> don't care, app should be able to recover from a crash
>> Disk 2 contains A and B', but they are wrongly ordered because C is
>> missing
>> If in the above case A and C are data blocks and B contains a journal
>> related to A and C, booting from disk 2 could result in inconsistent
>> data.
>>
>> can the above really happen?
>> would using barriers remove the above concern?
>> am I missing something else?
>
>There is no inconsistency here that a filesystem would not equally expect
>from a single device.
>After the crash-while-writing B', it should expect to see either B or B',
>and it does, depending on which device is primary.
>Nothing to see here.

I will try to explain better:
the problem is not confusion between B and B',
the problem is that on one disk we have B' _without_ C.
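
To make the scenario concrete, here is a minimal sketch in Python that replays the write sequence from the example and shows what each disk holds at the moment of the crash. The dict-per-disk model and the replay() helper are hypothetical, made up for illustration only; this is not how md tracks writes internally.

```python
# Hypothetical model: each disk is a dict mapping block name -> contents.
def replay(events):
    """Apply (disk, block, value) writes in order and return both disks."""
    disks = {1: {}, 2: {}}
    for disk, block, value in events:
        disks[disk][block] = value
    return disks

# Writes that reached the media before the crash, in the order of the example.
events = [
    (1, "A", "A"), (2, "A", "A"),  # md writes A on both disks
    (1, "B", "B"),                 # md writes B on disk1
    (2, "B", "B'"),                # app rewrote B, md writes B' on disk2
    # crash: the rewrite of B' on both disks and the write of C never happen
]

disks = replay(events)
for n, contents in sorted(disks.items()):
    print(f"disk{n}: {contents}, has C: {'C' in contents}")
```

In this model disk1 ends up with A and B, disk2 with A and B', and neither has C, which is the state I am worried about when disk2 is the one read after the crash.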
Regards,
L.
--
Luca Berra -- bluca@xxxxxxxxxx
Communication Media & Services S.r.l.
/"\
\ / ASCII RIBBON CAMPAIGN
X AGAINST HTML MAIL
/ \