Re: Data corruption when using dm-crypt over RAID5

Andrew Morton <akpm@xxxxxxxx> · Tue, 28 Nov 2006 11:27:46 -0800

On Tue, 28 Nov 2006 02:22:23 +0300
Andrey <andrey@xxxxxxxxx> wrote:

> Hello, all.
> 
> This is about the well-known "dm-crypt is broken and causes massive data 
> damage" problem. Please read.
> 
> I'm trying to chase data corruption that happens when I use dm-crypt 
> over RAID5. About 3 years ago I built a storage server for myself that 
> uses the same setup (a dm-crypt partition over a software RAID5 array) 
> and I have used it ever since without any problems. Last week I built 
> another server with the same setup; the only differences are that it is 
> x86_64 and the disks are bigger. I have an intermittent repro of this 
> corruption. I spent several days chasing it, but still have not found 
> the cause. Here is my setup:
> 
> - 64-bit AMD CPU, 1 GB RAM, 2.6.18.3 kernel
> 
> /proc/mdstat
> 
> md1 : active raid5 hdc3[2] hdb2[1] hda2[0]
>       778485504 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
> 
> First, the problem can be reproduced with the following steps:
> - create the crypt target
> - copy about 4GB of data (pdflush takes about 40% CPU, and all free 
> memory goes to cache)
> - e2fsck -f
> 
> Attached is the script I use.
> 
> What I have verified so far (in the order I tried):
> 
> This is dm-crypt related. Replacing crypt with linear eliminates all 
> corruption, i.e. "0 18874368 linear /dev/md1 1280" works while "0 
> 18874368 crypt aes-cbc-plain $KEY128 0 /dev/md1 1280" yields corruption.
> 
> This is not crypto or IV related. Next, I created another crypto 
> algorithm, "aesfake", which does nothing but has all the characteristics 
> of AES (including CPU load). To my surprise, the problem still 
> reproduces. Below is the relevant routine from the crypto module:
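> static void aes_fake_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
> {
>     u8 tmp[AES_BLOCK_SIZE];
> 
>     /* do the real AES work into a scratch buffer, for CPU load only */
>     aes_enc_blk(tfm, tmp, src);
>     /* then pass the input through unchanged */
>     memmove(dst, src, AES_BLOCK_SIZE);
> }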
> 
> This is not an issue of wrong sector remapping or data corruption. I 
> added a huge per-cc allocation in dm-crypt (about 100MB per 9 GB volume), 
> where I was tracking 3 things: the crc of a sector, the sector number, 
> and the sector number written (i.e. I was overwriting the first 4 bytes 
> of each sector with the sector number, and keeping the original data in 
> memory). The "corrupting" and the "check" for the sector number were 
> done in crypt_convert_scatterlist. In addition, I added code that 
> calculated and stored a sector checksum on writes and verified it on 
> reads. The checksum calculation was in map, on a virgin bio before 
> dm-crypt did anything to it. Verification was right after decryption, 
> on the cloned bios. All of this yielded NOTHING, i.e. everything worked 
> as expected.
> 
> The problem SEEMS to be in the read cycle. Since the corruption 
> reproduces even with a dummy no-op cipher, I started to disable various 
> paths in dm-crypt. I do this by converting dm-crypt to linear for some 
> bios, by adding the following code to the top of the map function:
>     int bypass = 0;
> 
>     // bypass certain requests
>     if (bio_data_dir(bio) == WRITE) bypass=1;
> //    if (bio_data_dir(bio) == READ) bypass=1;
> 
>     if (bypass)
>     {
>         bio->bi_bdev = cc->dev->bdev;
>         bio->bi_sector = cc->start + sector;
> 
>         mempool_free(io, cc->io_pool);
>         return 1;
>     }
> It turns out that having the read path active is a requirement for 
> trouble. If I bypass writes the problem still persists, but bypassing 
> reads eliminates the problem completely.
> 
> So far my theory is that some reads are just disappearing under heavy 
> load. This is the only explanation I can think of for why all crc/sector 
> checks pass (they occur in the endio routine) while corruption still 
> occurs.
> 
> Does this make any sense? Does anyone have any ideas what to check 
> next? Is there a kernel branch in which this problem has already been 
> fixed? Any comments are welcome.
> 

Interesting.   Thanks for doing all this.

One simple theory is that a memory allocation failed under load and the
error-handling for that has a bug.  It would be good if we had a printk in
there in all error paths to see if one pops up.
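
As a rough illustration of the "printk in all error paths" idea, here is
a user-space sketch.  It is not dm-crypt code: alloc_failed(), the
ALLOC_FAILED() macro and the alloc_failures counter are made-up names,
and fprintf() merely stands in for printk().

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Number of allocation failures reported so far. */
static unsigned long alloc_failures;

/*
 * Hypothetical helper, not taken from dm-crypt: returns 1 and logs the
 * call site if p is NULL, returns 0 otherwise.  In the kernel the
 * fprintf() would be a printk().
 */
static int alloc_failed(const void *p, size_t size, const char *site)
{
    assert(site != NULL);
    if (p != NULL)
        return 0;
    alloc_failures++;
    fprintf(stderr, "%s: allocation of %zu bytes failed\n", site, size);
    return 1;
}

/* Wrap the check so that every error path reports its own function. */
#define ALLOC_FAILED(p, size) alloc_failed((p), (size), __func__)
```

Each allocation site would test ALLOC_FAILED(ptr, len) before taking its
error path, so a failure under load leaves a line in the log naming the
function it happened in.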

Also, 2.6.19-rc6-mm2 has the fault-injection patches which can be used to
deliberately cause the dm-crypt module to experience memory allocation
failures.
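
To show the kind of test this enables, here is a user-space analogue.
It does not use the interface of the fault-injection patches:
faulty_malloc(), tracked_free(), fail_interval and process_request()
are all invented for the example.  The allocator fails every Nth call,
and a counter of live allocations shows whether the error path cleaned
up after itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

static unsigned long fail_interval; /* fail every Nth call; 0 = never */
static unsigned long alloc_calls;   /* calls to faulty_malloc() so far */
static long live_allocs;            /* allocations not yet freed */

/* malloc() wrapper that injects a failure on every Nth call. */
static void *faulty_malloc(size_t size)
{
    void *p;

    alloc_calls++;
    if (fail_interval != 0 && alloc_calls % fail_interval == 0)
        return NULL;                /* injected failure */
    p = malloc(size);
    if (p != NULL)
        live_allocs++;
    return p;
}

/* free() wrapper that keeps the live allocation count in step. */
static void tracked_free(void *p)
{
    if (p == NULL)
        return;
    assert(live_allocs > 0);
    live_allocs--;
    free(p);
}

/*
 * Example code under test: needs two buffers and has to release the
 * first one if the second allocation fails.
 */
static int process_request(size_t len)
{
    char *src, *dst;

    src = faulty_malloc(len);
    if (src == NULL)
        return -1;
    dst = faulty_malloc(len);
    if (dst == NULL) {
        tracked_free(src);          /* the error path being exercised */
        return -1;
    }
    memset(src, 0xAB, len);
    memcpy(dst, src, len);
    tracked_free(dst);
    tracked_free(src);
    return 0;
}
```

Running process_request() with fail_interval set to 1 and then 2 hits
each error path in turn; live_allocs ending up non-zero after a failed
call would point at a leak in that path.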

---------------------------------------------------------------------
dm-crypt mailing list - http://www.saout.de/misc/dm-crypt/
To unsubscribe, e-mail: dm-crypt-unsubscribe@xxxxxxxx
For additional commands, e-mail: dm-crypt-help@xxxxxxxx