On Tue, 28 Nov 2006 02:22:23 +0300 Andrey <andrey@xxxxxxxxx> wrote:

> Hello, all.
>
> This is about the well-known "dm-crypt is broken and causes massive data
> damage". Please read.
>
> I'm trying to chase data corruption that happens when I use dm-crypt
> over RAID5. About 3 years ago I built a storage server for myself that
> uses the same setup (dm-crypt partition over a software RAID5 array) and
> have used it ever since without any problems. Last week I built another
> server with the same setup, the only difference being that it is x86_64
> and the disks are bigger. I have a random repro of this corruption. I
> spent several days chasing it, but still have not found the cause. Here
> is my setup:
>
> - 64-bit amd cpu, 1 gb ram, 2.6.18.3 kernel
>
> /proc/mdstat
>
> md1 : active raid5 hdc3[2] hdb2[1] hda2[0]
>       778485504 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
>
> Firstly, the problem can be reproduced with the following steps:
> - create crypt target
> - copy about 4GB of data (pdflush takes about 40% CPU, and all free
>   memory goes to cache)
> - e2fsck -f
>
> Attached is a script I use.
>
> What I verified so far (in the order I tried):
>
> This is dm-crypt related. Replacing crypt with linear eliminates all
> corruptions, i.e. "0 18874368 linear /dev/md1 1280" works while "0
> 18874368 crypt aes-cbc-plain $KEY128 0 /dev/md1 1280" yields corruptions.
>
> This is not crypto or iv related. Next I created another crypto
> algorithm, "aesfake", which does nothing but has all the characteristics
> of AES (including CPU load). To my surprise the problem still reproduces.
> Below is the relevant routine from the crypto module:
>
> static void aes_fake_encrypt(struct crypto_tfm *tfm, u8 *dst, const u8 *src)
> {
>         u8 tmp[AES_BLOCK_SIZE];
>
>         /* run the real cipher for its CPU cost; the result is discarded */
>         aes_enc_blk(tfm, tmp, src);
>         /* pass the data through unchanged */
>         memmove(dst, src, AES_BLOCK_SIZE);
> }
>
> This is not an issue of wrong sector remap or data corruption.
> I added a huge per-cc allocation in dm-crypt (about 100MB per 9 GB
> volume), where I was tracking 3 things: the crc of a sector, the sector
> number, and the sector number written (i.e. I was overwriting the first
> 4 bytes of each sector with the sector number, and kept the original
> data in memory). The "corrupting" and the "check" for the sector number
> were done in crypt_convert_scatterlist. In addition, I added code that
> calculated and stored a sector checksum on write and verified it on
> reads. Checksum calculation was in map, on a virgin bio before dm-crypt
> did anything to it. Verification was right after decryption, on cloned
> bios. All of this yielded NOTHING, i.e. everything worked as expected.
>
> The problem SEEMS to be in the read cycle. Since corruptions reproduce
> even with a dummy no-op cipher, I started to disable various paths in
> dm-crypt. I do this by converting dm-crypt to linear for some bios by
> adding the following code to the top of the map function:
>
>         int bypass = 0;
>
>         // bypass certain requests
>         if (bio_data_dir(bio) == WRITE) bypass = 1;
>         // if (bio_data_dir(bio) == READ) bypass = 1;
>
>         if (bypass)
>         {
>                 // remap straight to the underlying device, as linear does
>                 bio->bi_bdev = cc->dev->bdev;
>                 bio->bi_sector = cc->start + sector;
>
>                 mempool_free(io, cc->io_pool);
>                 return 1;
>         }
>
> It turns out that having a read path is a requirement for trouble. If I
> bypass writes the problem still persists, but bypassing reads eliminates
> the problem completely.
>
> So far my theory is that some reads are just disappearing under heavy
> load. This is the only thing I can think of that explains why all
> crc/sector checks pass (they occur in the endio routine) but corruptions
> still occur.
>
> Does it make any sense? Does anyone have any ideas what to check next?
> Is there any special kernel branch that has this problem fixed long ago?
> Any comments are welcome.
>

Interesting. Thanks for doing all this.

One simple theory is that a memory allocation failed under load and the
error-handling for that has a bug.
It would be good if we had a printk in there in all error paths to see
if one pops up.

Also, 2.6.19-rc6-mm2 has the fault-injection patches which can be used
to deliberately cause the dm-crypt module to experience memory
allocation failures.

---------------------------------------------------------------------
dm-crypt mailing list - http://www.saout.de/misc/dm-crypt/
To unsubscribe, e-mail: dm-crypt-unsubscribe@xxxxxxxx
For additional commands, e-mail: dm-crypt-help@xxxxxxxx
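
The idea in the reply (log every error path, then force allocation
failures and see which log line fires) can be tried out in userspace
first. What follows is only a minimal sketch under stated assumptions,
not dm-crypt code: faulty_alloc(), log_error_path(), process_read() and
run_reads() are hypothetical names made up for illustration, fprintf()
stands in for printk(), and failing every Nth allocation is a crude
imitation of what the kernel fault-injection patches do.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical injector state: fail every fail_interval-th call (0 = never). */
static unsigned long fail_interval;
static unsigned long alloc_calls;
static unsigned long error_paths_hit;

/* malloc() wrapper that returns NULL on every fail_interval-th call. */
static void *faulty_alloc(size_t size)
{
        alloc_calls++;
        if (fail_interval != 0 && alloc_calls % fail_interval == 0)
                return NULL;            /* injected failure */
        return malloc(size);
}

/* Userspace stand-in for a printk() in an error path: count and report it. */
#define log_error_path(msg) do {                                        \
        error_paths_hit++;                                              \
        fprintf(stderr, "%s:%d: %s\n", __func__, __LINE__, (msg));      \
} while (0)

/*
 * Hypothetical read path that needs a temporary buffer.
 * Returns 0 on success, -1 if the allocation failed (dst is left untouched).
 */
static int process_read(unsigned char *dst, const unsigned char *src, size_t len)
{
        unsigned char *bounce;

        assert(len > 0);
        bounce = faulty_alloc(len);
        if (bounce == NULL) {
                log_error_path("bounce buffer allocation failed");
                return -1;              /* report the error, never drop it */
        }
        memcpy(bounce, src, len);
        memcpy(dst, bounce, len);
        free(bounce);
        return 0;
}

/* Issue `count` reads with injection every `interval` allocations;
 * return how many reads reported failure. */
static unsigned long run_reads(unsigned long count, unsigned long interval)
{
        unsigned char src[16] = "sector payload";
        unsigned char dst[16];
        unsigned long failed = 0;
        unsigned long i;

        fail_interval = interval;
        alloc_calls = 0;
        error_paths_hit = 0;
        for (i = 0; i < count; i++)
                if (process_read(dst, src, sizeof(src)) != 0)
                        failed++;
        return failed;
}
```

Calling run_reads(10, 3) fails the 3rd, 6th and 9th allocation, so three
reads report an error and three log lines appear on stderr. If the count
of failed reads ever differed from error_paths_hit, some error path would
be swallowing a failure silently, which is the kind of bug the reply
suspects.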