Re: dm-crypt is broken and causes massive data corruption

Alasdair G Kergon wrote:
On Wed, May 24, 2006 at 08:02:16AM +0000, Kevin Eilers wrote:
If you need any help in resolving this problem (means: testing patches, sending you logfiles etc.), I'd be glad to help.
What we need is some careful testing changing just one thing at a time,
attempting to discover other configurations that show similar corruption:
We still don't know which kernel subsystem is the one with the bug.

e.g.
  Replace dm-crypt with dm-linear (ie standard logical volume)
  Replace md raid5 with md linear
  Older upstream kernels (say from 2.6.12)
  Latest upstream git & -mm kernel (to pick up recent md patches)
  Readahead disabled
  Different cyphers (incl. null?)

Alasdair
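
The one-change-at-a-time plan quoted above can be written down as a small test matrix. The following is only a sketch in Python: the baseline labels ("current", "original") and the function name are assumptions made for illustration, since the exact baseline kernel and cipher are not stated here. Each generated configuration differs from the failing baseline in exactly one factor, so a change in behaviour can be attributed to that factor.

```python
# Baseline: the configuration that shows corruption.
# The labels "current" and "original" are placeholders, not known values.
BASELINE = {
    "dm_target": "dm-crypt",
    "md_level": "raid5",
    "kernel": "current",
    "readahead": "enabled",
    "cipher": "original",
}

# Alternatives per factor, taken from the list quoted above.
ALTERNATIVES = {
    "dm_target": ["dm-linear"],
    "md_level": ["linear"],
    "kernel": ["2.6.12", "latest git", "latest -mm"],
    "readahead": ["disabled"],
    "cipher": ["different cipher", "null"],
}


def single_change_variants(baseline, alternatives):
    """Return (factor, config) pairs that each change exactly one factor."""
    variants = []
    for factor, values in alternatives.items():
        for value in values:
            config = dict(baseline)  # copy, so the baseline stays untouched
            config[factor] = value
            variants.append((factor, config))
    return variants


if __name__ == "__main__":
    for factor, config in single_change_variants(BASELINE, ALTERNATIVES):
        print(f"change {factor:10s} -> {config[factor]}")
```

Running every variant under the same load and recording which ones still corrupt data narrows down the responsible subsystem without confounding two changes at once.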

Just to throw in my information:

I have two machines with huge RAID-6 arrays under 2.6.13 (SuSE 10.0) with EXT3 filesystems and AES ciphers. Under DM-crypt, I am experiencing these corruptions, too. The systems were perfectly stable without encryption.

I previously had problems with CRYPTOLOOP (for which one can choose compatible parameters): the systems would show strange behaviour (sometimes the network stopped responding, sometimes all processes accessing the filesystem hung). This looked like a deadlock or race condition between LOOP and RAID to me, but it could also point to a problem in the optimized aes-i586 module, which is shared by both DM-crypt and CRYPTOLOOP.

Thus, I switched to DM-crypt, which seemed stable on two of my non-raid systems (one employs AES-Padlock and not aes-i586, however).

Since that switch, made because of the LOOP problems, I have had filesystem corruption on the RAID machines under heavy load (e.g. creating and deleting many symlinks).
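
For anyone who wants to generate a comparable load, a minimal sketch in Python follows. The function name, link names and counts are assumptions for illustration and not the workload from the affected machines. It creates many symlinks, reads each one back, and deletes them again, reporting any link whose target does not match what was written.

```python
import os
import tempfile


def symlink_churn(directory, count=1000, rounds=3):
    """Create, verify and delete `count` symlinks, `rounds` times.

    Returns a list of (link, expected_target, actual_target) mismatches.
    """
    mismatches = []
    for r in range(rounds):
        expected = {}
        for i in range(count):
            link = os.path.join(directory, f"link-{r}-{i}")
            target = f"target-{r}-{i}"
            os.symlink(target, link)  # dangling targets are fine for this test
            expected[link] = target
        for link, target in expected.items():
            actual = os.readlink(link)
            if actual != target:
                mismatches.append((link, target, actual))
        for link in expected:
            os.unlink(link)
    return mismatches


if __name__ == "__main__":
    # Point this at a directory on the encrypted filesystem instead of a
    # temporary directory to exercise the dm-crypt mapping.
    with tempfile.TemporaryDirectory() as d:
        bad = symlink_churn(d, count=200, rounds=2)
        print(f"{len(bad)} mismatches")
```

Note that the read-back is likely to be served from the kernel's caches, so a clean run does not prove the on-disk data is intact; unmounting and running fsck afterwards remains the real check.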

One time, I was so (un)lucky as to create the EXT3 filesystem on the mapped volume and got a corruption on the first access. To save time, instead of the usual unmount, fsck and mount routine, I simply unmounted and recreated the filesystem on top of the still-active DM mapping. To my complete disbelief, after mounting the newly-created filesystem, the first access had a corruption again!

So, it seems that once the mapping is broken, it stays broken until it is removed and recreated with cryptsetup.

To rule out the AES cipher (or, to be more exact, the i586 assembler implementation), I am now trying Twofish to see if the problem goes away, although I doubt it will. It seems related to the combination of encryption and RAID.

I saw that someone else tried with dm-linear instead of dm-crypt with no luck, and the dm-crypt.c patch regarding the barrier block does not seem to help either.

Resolving this issue is rather urgent for me, too!


Uwe
--
f+gmane@xxxxxxxxxxx

