Re: Why does one get mismatches?

On Mon, Mar 1, 2010 at 11:36 PM, Luca Berra <bluca@xxxxxxxxxx> wrote:
> On Tue, Mar 02, 2010 at 04:01:00PM +1100, Neil Brown wrote:
>>
>> On Sun, 28 Feb 2010 09:09:49 +0100
>> Luca Berra <bluca@xxxxxxxxxx> wrote:
>>
>>> On Thu, Feb 25, 2010 at 08:39:36AM +1100, Neil Brown wrote:
>>> >On Wed, 24 Feb 2010 11:12:09 -0500
>>> >"Martin K. Petersen" <martin.petersen@xxxxxxxxxx> wrote:
>>> >
>>> >> So realistically both disk blocks are wrong and there's a window until
>>> >> the new, correct block is written.  That window will only cause problems
>>> >> if there is a crash and we'll need to recover.  My main concern here is
>>> >> how big the discrepancy between the disks can get, and whether we'll end
>>> >> up corrupting the filesystem during recovery because we could
>>> >> potentially be matching metadata from one disk with journal entries from
>>> >> another.
>>> >
>>> >After a crash, md will only read from one of the devices (the first) until a
>>> >resync has completed.  So there should be no room for more confusion than you
>>> >would expect on a single device.
>>>
>>> After thinking more about this i could come up with another concern
>>> about write ordering.
>>>
>>> example
>>> app writes block A, B, C
>>> md writes A on both disks
>>> md writes B on disk1
>>> app writes B again (B')
>>> md writes B' on disk2
>>> now md would write B' again on both disks, but the system crashes
>>> (note, C is never written due to crash)
>>>
>>> Disk 1 contains A and B in the correct order, it is missing C and B' but we
>>> don't care, app should be able to recover from a crash
>>>
>>> Disk 2 contains A and B', but they are wrongly ordered because C is missing
>>>
>>> If in the above case A and C are data blocks and B contains a journal
>>> related to A and C, booting from disk 2 could result in inconsistent
>>> data.
>>>
>>> can the above really happen?
>>> would using barriers remove the above concern?
>>> am i missing something else?
>>
>> There is no inconsistency here that a filesystem would not equally expect
>> from a single device.
>> After the crash-while-writing B', it should expect to see either B or B',
>> and it does, depending on which device is primary.
>>
>> Nothing to see here.
>
> I will try to explain better,
> the problem is not related to the confusion between B or B'
>
> the problem is that on one disk we have B' _without_ C.
>
> Regards,
> L.
>
> --
> Luca Berra -- bluca@xxxxxxxxxx
>        Communication Media & Services S.r.l.
>  /"\
>  \ /     ASCII RIBBON CAMPAIGN
>  X        AGAINST HTML MAIL
>  / \
> --
> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>

You're demanding full atomic commits; this is precisely what journals
and /barriers/ are for.

Are you bypassing them in a quest for performance and paying for
it on crashes?
Or is this a hardware bug?
Or is it some glitch in the block device layering leading to barrier
requests not being honored?
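
For what it's worth, the quoted A/B/C/B' scenario can be put into a toy
model. This is only a sketch, not md code: ordered_states, app_writes,
disk1 and disk2 are names made up for illustration, and it assumes that
writes must reach a disk strictly in the order the application issued
them, which a real device only promises when barriers (or equivalent
flushes) are in place.

```python
# Toy model of the quoted A/B/C/B' scenario. Not md code; every name here
# is hypothetical, and strict in-order completion is an assumption.

def ordered_states(writes):
    """Return every on-disk state reachable if writes land strictly in issue order."""
    state = {}
    states = [dict(state)]          # the empty disk before any write
    for block, data in writes:
        state[block] = data
        states.append(dict(state))  # snapshot after each completed write
    return states

# Order in which the application issued its writes.
app_writes = [("A", "A"), ("B", "B"), ("C", "C"), ("B", "B'")]

# What each mirror leg holds at the moment of the crash, per the example.
disk1 = {"A": "A", "B": "B"}
disk2 = {"A": "A", "B": "B'"}

valid = ordered_states(app_writes)
print("disk1 is an in-order state:", disk1 in valid)
print("disk2 is an in-order state:", disk2 in valid)
```

Under that strict-ordering assumption disk1 passes and disk2 does not,
since B' without C is not a state an in-order device could hold. Without
barriers the same state could show up on a single disk as well, because
the device is free to reorder the writes.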
