Re: Data corruption when using dm-crypt over RAID5

Andrey <dm-crypt-revealed-address@xxxxxxxxx> · Fri, 01 Dec 2006 00:02:27 +0300

Alasdair G Kergon wrote:
> On Tue, Nov 28, 2006 at 02:22:23AM +0300, Andrey wrote:
>
>> mke2fs -b 4096 -R stride=16 /dev/mapper/v0 1572864
>
> Please try to reproduce without a filesystem: use a program that
> reads/writes known patterns onto the device so we get clues
> about the nature of the corruption.
>
> Alasdair

Some more info on this:

- It does not reproduce on raid1, only on raid5.
- I was unable to reproduce it without a filesystem.
- It looks like at least a stable workaround has been found.
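For reference, the kind of pattern test suggested above could look roughly like the sketch below. It is only an illustration, not the program I actually ran: the function names and the pattern layout (block index plus word index in every 8-byte word) are my own assumptions. The idea is that every mismatch tells you which block the data really belongs to.

```python
import os
import struct

BLOCK = 4096  # same block size as in the mke2fs call


def make_block(index):
    """Build one block; each 8-byte word holds (block index, word index)."""
    return b"".join(struct.pack("<II", index, w) for w in range(BLOCK // 8))


def write_pattern(path, nblocks):
    """Write nblocks self-describing blocks to an existing device or file."""
    with open(path, "r+b") as dev:
        for i in range(nblocks):
            dev.seek(i * BLOCK)
            dev.write(make_block(i))
        dev.flush()
        os.fsync(dev.fileno())


def verify_pattern(path, nblocks):
    """Return (block, byte offset of first mismatch) for every bad block."""
    bad = []
    with open(path, "rb") as dev:
        for i in range(nblocks):
            dev.seek(i * BLOCK)
            got = dev.read(BLOCK)
            want = make_block(i)
            if got != want:
                n = min(len(got), len(want))
                first = next((k for k in range(n) if got[k] != want[k]), n)
                bad.append((i, first))
    return bad
```

Usage would be write_pattern("/dev/mapper/v0", N) followed by verify_pattern("/dev/mapper/v0", N). Note that a read right after the write is probably served from the page cache, so the caches should be dropped in between (for example through /proc/sys/vm/drop_caches) for the reads to reach the device.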

The root of the problem is READA (readahead) requests. The current stable
workaround is to return -EIO for all READA requests in the dm-crypt map
function. With that in place the corruption no longer reproduces.

Separately, there is the question of whether dm-crypt should support
READA bios at all. I found at least two reasons why it shouldn't:
1. Each request takes a lot of CPU to pass through dm-crypt. It does not
make sense to spend all that CPU time filling a readahead buffer. On my
system there was no noticeable performance difference once I disabled
READA bios.
2. The raid driver at least tries to do something about READA requests:
it calculates ra_pages, looks at sync status, etc. dm-crypt does nothing
like that while sitting on top of raid.

The other issue is why this corruption happens at all, and here things
look very strange. As a refresher: dm-crypt handles READA bios the same
way it handles a regular READ. It creates a clone (actually a whole new
bio, copying bi_rw and bi_size), passes it down, and decrypts in the
endio routine. The corruption looks like this from inside dm-crypt:

- The fs asks to write 2048 bytes of zeroes to some sector. It happens
ONLY with 2K requests and only if the data is all zeroes. It happens on
random sectors. The write request completes without error.
- Someone (I have only a vague idea of how the Linux cache works) sends
a READA bio of size 4K for an area starting at the same sector. It goes
down to raid5, which returns garbage. The data returned is not all
zeroes, and it _looks_ like the data that was there before the zeroes
were written. That is the corruption.
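A rough sketch of how the sequence above could be exercised directly against the device follows. It is an illustration only: the helper names are made up, and it is purely an assumption that a POSIX_FADV_WILLNEED hint makes the kernel issue a READA bio rather than a regular READ. The two steps are separate functions so that the page cache can be dropped between them.

```python
import os

SECTOR = 512


def write_zeroes(path, sector, length=2048):
    """Overwrite `length` bytes starting at `sector` with zeroes and sync."""
    fd = os.open(path, os.O_RDWR)
    try:
        os.lseek(fd, sector * SECTOR, os.SEEK_SET)
        os.write(fd, b"\x00" * length)
        os.fsync(fd)
    finally:
        os.close(fd)


def read_back_nonzero(path, sector, write_len=2048, read_len=4096):
    """Read read_len bytes at `sector`; return offsets of non-zero bytes
    inside the first write_len bytes (empty list means no corruption seen)."""
    offset = sector * SECTOR
    fd = os.open(path, os.O_RDONLY)
    try:
        # Readahead hint; whether this ends up as a READA bio is an assumption.
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(fd, offset, read_len, os.POSIX_FADV_WILLNEED)
        os.lseek(fd, offset, os.SEEK_SET)
        data = os.read(fd, read_len)
    finally:
        os.close(fd)
    return [i for i, b in enumerate(data[:write_len]) if b != 0]
```

The area should hold non-zero data before write_zeroes() is called, otherwise stale data and freshly written zeroes cannot be told apart.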

If I disable READA in dm-crypt, the same pattern with a regular READ works fine.

Any ideas? Should this be investigated at all? Is there anything special
somewhere about zero-filled writes? I can probably add yet another ton
of printk's to the raid5 or cache code, but any hints on what to look for
would be appreciated.

              Thanks, Andrey.

---------------------------------------------------------------------
dm-crypt mailing list - http://www.saout.de/misc/dm-crypt/