Alasdair G Kergon wrote:
> On Wed, May 24, 2006 at 08:02:16AM +0000, Kevin Eilers wrote:
>> If you need any help in
>> resolving this problem (means: testing patches, sending you logfiles etc.), I'd
>> be glad to help.
> What we need is some careful testing changing just one thing at a time,
> attempting to discover other configurations that show similar corruption:
> We still don't know which kernel subsystem is the one with the bug.
> e.g.
> Replace dm-crypt with dm-linear (ie standard logical volume)
> Replace md raid5 with md linear
> Older upstream kernels (say from 2.6.12)
> Latest upstream git & -mm kernel (to pick up recent md patches)
> Readahead disabled
> Different cyphers (incl. null?)
> Alasdair
Just to throw my information in:
I have two machines with huge RAID-6 arrays under 2.6.13 (SuSE 10.0)
with EXT3 filesystems and AES ciphers. Under DM-crypt, I am experiencing
these corruptions, too. The systems were perfectly stable without
encryption.
I had some problems with CRYPTOLOOP (for which one can choose compatible
parameters): the systems would then show strange behaviour (sometimes
the network stops responding, sometimes all processes accessing the
filesystem hang). This looked like a deadlock or race condition between
LOOP and RAID to me, but could also point to a problem in the optimized
aes-i586 module, which is shared by both DM-crypt and CRYPTOLOOP.
Because of the LOOP problems, I switched to DM-crypt, which had seemed
stable on two of my non-RAID systems (one of which uses AES-Padlock
rather than aes-i586, however). Since then I have been seeing filesystem
corruption under heavy load (e.g. inserting and deleting many symlinks).
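For anyone who wants to try a comparable load, here is a minimal sketch
of a symlink create/verify/delete loop. It assumes Python 3 and a
dedicated scratch directory on the affected filesystem; the function
name, link names and counts are made up for illustration, and this is
not the workload that triggered the corruption described above. Note
that the readlink results may be served from cache, so an empty result
does not prove the on-disk data is intact.

```python
import os


def symlink_stress(directory, count=1000, rounds=3):
    """Create, verify and delete `count` symlinks per round.

    Returns a list of (link, expected, observed) tuples describing
    every mismatch or error that was seen. An empty list means no
    problem was detected by this (cache-level) check.
    """
    problems = []
    for r in range(rounds):
        expected = {}
        # Create the links. The targets do not need to exist.
        for i in range(count):
            link = os.path.join(directory, "link-%d-%d" % (r, i))
            target = "target-%d-%d" % (r, i)
            os.symlink(target, link)
            expected[link] = target
        # Read every link back and compare with what was written.
        for link, target in expected.items():
            try:
                observed = os.readlink(link)
            except OSError as exc:
                problems.append((link, target, repr(exc)))
                continue
            if observed != target:
                problems.append((link, target, observed))
        # Delete the links again.
        for link in expected:
            try:
                os.unlink(link)
            except OSError as exc:
                problems.append((link, "unlink", repr(exc)))
        # Anything still named link-* should not be there any more.
        for name in os.listdir(directory):
            if name.startswith("link-"):
                problems.append((name, "deleted", "still present"))
    return problems
```

Calling symlink_stress("/mnt/test/scratch", count=100000, rounds=10)
(the path is a placeholder) and running fsck on the unmounted volume
afterwards gives a repeatable load to use while changing one component
at a time.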
One time, I was so (un)lucky as to create the EXT3 filesystem on the
mapped volume and got a corruption on the first access. To save time,
instead of the usual unmount, fsck and mount routine, I simply unmounted
and recreated the filesystem on top of the still-active DM mapping. To
my complete disbelief, after mounting the newly-created filesystem, the
first access had a corruption again!
So, it seems that once the mapping is broken, it stays broken until it
is removed and recreated with cryptsetup.
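One way to check whether a given mapping returns bad data, without
involving a filesystem check each time, is to write a known pattern and
read it back. The following is a rough sketch under several
assumptions: Python 3, a path that may be overwritten (a scratch file on
the mounted filesystem, or the mapped device itself if it holds no
data), and invented helper names. Reads right after a write can come
from the page cache, so the sketch asks the kernel to drop the cached
pages where os.posix_fadvise is available; that is only a hint, and
recreating the mapping between write and verify is the more reliable
test.

```python
import hashlib
import os

BLOCK = 4096  # bytes per pattern block


def pattern_block(index):
    """Return a deterministic 4 KiB block derived from its index."""
    seed = hashlib.sha256(b"block-%d" % index).digest()  # 32 bytes
    return seed * (BLOCK // len(seed))


def write_pattern(path, blocks):
    """Write `blocks` pattern blocks to `path` and flush them to disk."""
    with open(path, "wb") as f:
        for i in range(blocks):
            f.write(pattern_block(i))
        f.flush()
        os.fsync(f.fileno())
        # Hint to the kernel to drop cached pages so that a later read
        # is more likely to hit the device (Linux/Unix only).
        if hasattr(os, "posix_fadvise"):
            os.posix_fadvise(f.fileno(), 0, 0, os.POSIX_FADV_DONTNEED)


def verify_pattern(path, blocks):
    """Return the indices of all blocks that do not match the pattern."""
    bad = []
    with open(path, "rb") as f:
        for i in range(blocks):
            if f.read(BLOCK) != pattern_block(i):
                bad.append(i)
    return bad
```

If verify_pattern() reports bad blocks on the active mapping but the
same data verifies cleanly after the mapping has been removed and
recreated, that would support the observation above; if the bad blocks
persist, the data was already damaged on its way to the disk.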
To rule out the AES cipher (or, to be more exact, the i586 assembler
implementation), I am now trying Twofish to see whether the problem goes
away, although I doubt it will. It seems related to the combination of
encryption and RAID.
I saw that someone else tried with dm-linear instead of dm-crypt with no
luck, and the dm-crypt.c patch regarding the barrier block does not seem
to help either.
Resolving this issue is rather urgent for me, too!
Uwe
--
f+gmane@xxxxxxxxxxx