Re: Why does one get mismatches?

On Thu, 25 Feb 2010 08:22:10 +0100
Goswin von Brederlow <goswin-v-b@xxxxxx> wrote:

> Neil Brown <neilb@xxxxxxx> writes:
> 
> > On Wed, 24 Feb 2010 09:46:23 -0500
> > Bill Davidsen <davidsen@xxxxxxx> wrote:
> >
> >> > There is no question of data corruption.
> >> > When memory changes between being written to one device and to another, this
> >> > does not cause corruption, only inconsistency.   Either the block will be
> >> > written again consistently soon, or it will never be read.
> >> >     
> >> 
> >> Just what is it that rewrites the data block? The user program doesn't 
> >> know it's needed, the filesystem, if any, doesn't know it's needed, and 
> >> as far as I can tell md doesn't do checksum before issuing the write and 
> >> after the last write is done. Doesn't make a copy and write from that. 
> >> So what sees that the data has changed and rewrites it?
> >> 
> >
> > The filesystem re-writes the block, though probably it is more accurate to
> > say 'the page cache' rewrites the block (the page cache is essentially just a
> > library of code that the filesystem uses).
> >
> > When a page is changed, its 'Dirty' flag is set.
> > Before a page is written out, the Dirty flag is cleared.
> > So if a page is written differently to two devices, then it must have been
> > changed after the Dirty flag was clear, so the Dirty flag will be set, so the
> > page cache will try to write it out again (after about 30 seconds or at
> > unmount time).
> 
> So maybe MD could check the dirty flag after write and then output a
> warning so we can track down the issue. MD could also rewrite the page
> prior to setting the disks in-sync until the dirty bit is clear after a
> write.

md isn't able to see the dirty bit.

It gets a 'bio', which has a 'biovec', which holds a list of pages, each
with an offset and a size.
It does not know whether a page is in the page cache, so it cannot know
whether the dirty flag on that page means anything.
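
As a rough illustration only (a toy Python model, not kernel code), the
sketch below shows how a page that changes between the two mirror writes
leaves the devices inconsistent but also leaves the Dirty flag set, so the
page cache writes it again. Every name here (Page, BioVec, Bio,
mirror_write, writeback, demo) is hypothetical and chosen for the example;
none of them is the real kernel structure or function of that name.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Page:
    """Toy page: the dirty flag belongs to the page cache, not to md."""
    data: bytearray
    dirty: bool = False

    def modify(self, offset: int, value: int) -> None:
        self.data[offset] = value
        self.dirty = True


@dataclass
class BioVec:
    """One segment of a request: a page plus offset and length."""
    page: Page
    offset: int
    length: int


@dataclass
class Bio:
    """Toy request as the mirror layer sees it: a sector and segments."""
    sector: int
    vecs: List[BioVec] = field(default_factory=list)


def mirror_write(bio: Bio, devices: List[Dict[int, bytes]],
                 between: Optional[Callable[[], None]] = None) -> None:
    """Toy RAID1 write: read the payload afresh for each device.

    `between` is a hypothetical hook that runs after the first device has
    been written, standing in for an application touching a mapped page.
    """
    for i, dev in enumerate(devices):
        payload = b"".join(
            bytes(v.page.data[v.offset:v.offset + v.length]) for v in bio.vecs
        )
        dev[bio.sector] = payload
        if between is not None and i == 0:
            between()


def writeback(page: Page, devices: List[Dict[int, bytes]], sector: int,
              between: Optional[Callable[[], None]] = None) -> None:
    """Toy page-cache writeback: clear Dirty first, then submit the write."""
    page.dirty = False
    mirror_write(Bio(sector, [BioVec(page, 0, len(page.data))]),
                 devices, between)


def demo():
    page = Page(bytearray(b"AAAA"))
    page.modify(0, ord("B"))
    dev0, dev1 = {}, {}
    # The page changes after dev0 was written but before dev1 is written.
    writeback(page, [dev0, dev1], 0,
              between=lambda: page.modify(1, ord("C")))
    mismatch = dev0[0] != dev1[0]
    redirtied = page.dirty
    # The later writeback pass rewrites the block consistently.
    writeback(page, [dev0, dev1], 0)
    consistent = dev0[0] == dev1[0]
    return mismatch, redirtied, consistent
```

In this model mirror_write only ever looks at the pages, offsets and
lengths in the request; it is the writeback function, standing in for the
page cache, that owns and interprets the dirty flag.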

Yes, it technically could check the dirty bits and, if it sees any of them
set, reschedule the writes.  However:
 1- this is a layering violation - it is the wrong thing to do.
 2- it might not work.  The filesystem could keep the 'dirty' status elsewhere
    such as in a 'buffer_head', and only copy it through to the page
    occasionally.
 3- it could cause a live-lock.  If an application is changing a mapped page
    quite regularly, then the current pagecache will write it out every 30
    seconds or so.  Your proposed change would write it out again and again
    as soon as the previous write completes.
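
To put rough numbers on point 3, here is a small discrete-time sketch in
Python. It is an assumption-laden toy, not a model of the real writeback
code: time advances in whole ticks, the function name count_writes and all
of its parameters are hypothetical, and the 30-tick flush interval simply
stands in for the "every 30 seconds or so" mentioned above.

```python
def count_writes(duration, modify_interval, write_time,
                 rewrite_if_dirty, flush_interval=30):
    """Count writes issued for one page over `duration` ticks.

    The application dirties the page every `modify_interval` ticks and each
    write takes `write_time` ticks. With rewrite_if_dirty=False only the
    periodic flush issues writes. With rewrite_if_dirty=True a new write is
    issued as soon as the previous one completes, if the page is dirty.
    """
    dirty = False
    writes = 0
    busy_until = None  # tick at which the in-flight write completes

    for t in range(duration):
        # The application changes the mapped page.
        if t % modify_interval == 0:
            dirty = True

        # An in-flight write completes.
        if busy_until is not None and t >= busy_until:
            busy_until = None
            if rewrite_if_dirty and dirty:
                # Proposed behaviour: write again straight away.
                dirty = False
                writes += 1
                busy_until = t + write_time

        # Periodic flush, as the page cache does today.
        if busy_until is None and dirty and t % flush_interval == 0:
            dirty = False
            writes += 1
            busy_until = t + write_time

    return writes


periodic = count_writes(300, 1, 2, rewrite_if_dirty=False)
eager = count_writes(300, 1, 2, rewrite_if_dirty=True)
```

With a page that is modified on every tick, the periodic policy issues one
write per flush interval, while the rewrite-on-dirty policy keeps the
device busy with back-to-back writes for as long as the application keeps
touching the page.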

So, no:  we cannot do that.

NeilBrown