Alasdair G Kergon wrote:
> On Tue, Nov 28, 2006 at 02:22:23AM +0300, Andrey wrote:
>> mke2fs -b 4096 -R stride=16 /dev/mapper/v0 1572864
>
> Please try to reproduce without a filesystem: use a program that
> reads/writes known patterns onto the device so we get clues
> about the nature of the corruption.
>
> Alasdair
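
For reference, a pattern tester of the kind suggested above could look
roughly like the sketch below. This is only an illustration, not the
program anyone in this thread actually ran: the function names, the
seed value and the 4096-byte block size (taken from the mke2fs -b 4096
above) are assumptions. Every block is filled with its own block
index, so a bad block shows which block its contents came from.

```python
import os
import struct

BLOCK = 4096  # assumed block size, matching "mke2fs -b 4096"


def make_block(index: int, seed: int = 0x5EED) -> bytes:
    """Build one block in which every 8-byte word encodes (seed, index)."""
    word = struct.pack("<II", seed, index)
    return word * (BLOCK // len(word))


def write_pattern(path: str, nblocks: int) -> None:
    """Overwrite the first nblocks blocks of path with self-describing data."""
    fd = os.open(path, os.O_WRONLY)
    try:
        for i in range(nblocks):
            os.pwrite(fd, make_block(i), i * BLOCK)
        os.fsync(fd)
    finally:
        os.close(fd)


def verify_pattern(path: str, nblocks: int) -> list:
    """Return (block, found_seed, found_index) for every mismatching block."""
    bad = []
    fd = os.open(path, os.O_RDONLY)
    try:
        for i in range(nblocks):
            data = os.pread(fd, BLOCK, i * BLOCK)
            if data != make_block(i):
                # Decode the first word to see where the data came from.
                if len(data) >= 8:
                    seed, idx = struct.unpack_from("<II", data)
                else:
                    seed, idx = None, None
                bad.append((i, seed, idx))
    finally:
        os.close(fd)
    return bad
```

Running write_pattern() on a device such as /dev/mapper/v0 destroys
whatever is on it. Note also that verify_pattern() reads through the
page cache, so the reads may never reach the device unless the cache is
dropped first.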
Some more info on this:
- it does not reproduce on raid1, only on raid5
- I was unable to reproduce it without a filesystem
- it looks like at least a stable workaround has been found.
The root of the problem is READA requests. The current stable fix is to
return -EIO for all READA requests in the map function. That fixes it all.
First of all, there is a separate question of whether dm-crypt should
support READA bios at all. I found at least two reasons why it
shouldn't:
1. It takes a lot of CPU for each request to pass through dm-crypt. It
does not make sense to spend all this CPU time to fill the readahead
buffer. On my system there is no noticeable performance difference once
I disabled READA bios.
2. The raid driver at least tries to do something about READA requests.
It calculates ra_pages, looks at sync status, etc. dm-crypt does nothing
like that while sitting on top of raid.
The other issue is why this corruption happens at all, and here it looks
very strange. Just a refresher: dm-crypt handles READA bios the same way
as it handles a regular READ. It creates a clone (actually a whole new
bio, copying bi_rw and bi_size), passes it down and decrypts in the endio
routine. The corruption looks like this from inside dm-crypt:
- The fs asks to write 2048 bytes of zeroes to some sector. It always
happens ONLY with requests of 2K and only if the data is all zeroes. It
happens on random sectors. The write request completes without error.
- Someone (I have only a vague idea of how the Linux cache works) sends
a READA bio with a size of 4K for an area starting at the same sector.
It goes down to raid5, which returns garbage. The data returned is not
all zeroes and it _looks_ like the data that was there before the zeroes
were written. That is the corruption.
If I disable READA in dm-crypt, the same pattern with a regular READ
works fine.
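
The sequence described above (old data on disk, a 2K zero-filled
write, then a 4K read from the same sector) can be replayed from user
space roughly as follows. This is a sketch under assumptions: the
function name, the 512-byte sector size and the 0xAA filler are made
up for illustration, and nothing here forces the kernel to issue the
read as READA, so it may well not trigger the problem (it did not
reproduce without a filesystem).

```python
import os

SECTOR = 512  # assumed sector size


def zero_write_then_read(path: str, sector: int) -> dict:
    """Replay: old data present, 2K of zeroes written, 4K read back."""
    offset = sector * SECTOR
    old = b"\xaa" * 4096  # recognizable stand-in for the previous contents
    fd = os.open(path, os.O_RDWR)
    try:
        os.pwrite(fd, old, offset)
        os.fsync(fd)
        os.pwrite(fd, b"\x00" * 2048, offset)  # the 2K zero-filled write
        os.fsync(fd)
        # Advisory only: ask the kernel to drop cached pages so that the
        # read below is more likely to go down to the device.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
        data = os.pread(fd, 4096, offset)  # the 4K read
    finally:
        os.close(fd)
    return {
        "zeroes_intact": data[:2048] == b"\x00" * 2048,
        "stale_old_data": data[:2048] == old[:2048],
        "tail_intact": data[2048:] == old[2048:],
    }
```

On a healthy device zeroes_intact and tail_intact should be true; the
corruption described above would show up as stale_old_data being true.
Like any write test, this overwrites 4K at the given sector.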
Any ideas? Should it be investigated at all? Is there anything special
about zero-filled writes somewhere? I can probably add yet another ton
of printk's into the raid5 or cache code, but any hints on what to look
for would be appreciated.
Thanks, Andrey.
---------------------------------------------------------------------
dm-crypt mailing list - http://www.saout.de/misc/dm-crypt/