Re: Passphrase stops working.

Arno Wagner <arno@xxxxxxxxxxx> · Thu, 19 Jul 2012 03:19:18 +0200

On Wed, Jul 18, 2012 at 04:26:26PM -0700, Two Spirit wrote:
> Thanks for responding. Response inline below.
> 
> On Wed, Jul 18, 2012 at 3:12 PM, Arno Wagner <arno@xxxxxxxxxxx> wrote:
> 
> > Hi,
> >
> > On Wed, Jul 18, 2012 at 02:34:58PM -0700, Two Spirit wrote:
> > > Hello, I just wanted to get back to you.
> >
> > Thanks, always appreciated.
> >
[...]
> >
> > I would suspect there is a "wrap around" somewhere in the process
> > when the RAID gets a tiny bit larger than 2GB. That would write
> > right over the LUKS header. This should _not_ be happening (i.e.
> > is a bug), but would be plausible. This happens when sector numbers
> > are restricted to 2^22 by logical "and" with (2^22)-1.
> >
> 
> I assume you meant 2TB, not GB. It is plausible, but I'm a nobody so I
> can't fix it even if it is. Someone from this community will hopefully
> follow up with the right person.

Ah, yes. 2TB of course. And you can file a bug report just like
everybody else.

> > Can you check whether the start of the other RAID drives is also
> > overwritten? The header itself is small enough to be only
> > on one disk, but the keyslot-area should get distributed on all of
> > them.
> >
> I'm not quite sure what you mean by "start of the raid". 

Not start of the RAID, start of the RAID disks. If the partition
table/Luks header vanishes from one disk, then I suspect the
start of the other disks gets overwritten as well, just a bit
later. RAID5 drives are interleaved.

> Of the 4 disk
> raid5, the disks 2-3-4 seem OK, and the disk 1 is missing the msdos
> partition table which would then be missing the 0xfd linux auto-raid
> partition which contains the 256 byte mdadm metadata v0.9 superblock.  This

Ah, no. The 0.90 RAID superblock is at the end of the partition/disk.
That is one of its advantages.
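
For reference, md(4) describes the 0.90 location as: round the device
size down to a multiple of 64K, then subtract 64K. A minimal Python
sketch of that rule (the function name is made up for illustration):

```python
MD_RESERVED_BYTES = 64 * 1024  # 0.90 superblock lives in a 64K-aligned block


def md090_superblock_offset(device_bytes):
    """Byte offset of a 0.90 superblock on a device of the given size.

    Rule from md(4): round the size down to a 64K multiple, subtract 64K.
    """
    if device_bytes < 2 * MD_RESERVED_BYTES:
        raise ValueError("device too small for a 0.90 superblock")
    return (device_bytes & ~(MD_RESERVED_BYTES - 1)) - MD_RESERVED_BYTES
```

So the superblock always sits between 64K and 128K before the end,
and nothing at the start of the component is needed to find it.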

> is all while the raid is running and LUKS is already open so there is no
> other indication of anything going wrong. 

Yes. There would not be, unless you run a header check or a
simple "isLuks".
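
A very rough stand-in for such a check, as a Python sketch: a LUKS1
header begins with the six magic bytes "LUKS" 0xBA 0xBE, so looking
at the first bytes of the device already shows whether the header
start survived. This only tests the magic, not the rest of the
header or the keyslots, and the function name is my own invention:

```python
LUKS_MAGIC = b"LUKS\xba\xbe"  # first six bytes of a LUKS1 header


def looks_like_luks(path):
    """Return True if the file or device at `path` starts with the LUKS magic."""
    with open(path, "rb") as dev:
        return dev.read(len(LUKS_MAGIC)) == LUKS_MAGIC
```

Run it against the RAID device that holds the LUKS container (for
example looks_like_luks("/dev/md0"), where /dev/md0 is only a
placeholder name). "cryptsetup isLuks <device>" reports the same
kind of result via its exit status.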

> As far as I know from the raid
> point of view, nothing seems to be wrong. I was able to rebuild the raid
> array. 

The beginning of the RAID components could still have been 
overwritten as the superblock is at the end. The partition
table being missing indicates this problem does not stay
inside the RAID.

> It was more of a fluke that I found this corruption because the only
> thing I really noticed is that I couldn't get my luks open. It was
> strange (not a bug, is a feature) that there were no errors reported by mdadm
> of the corrupt state of disk1. 

Mdadm may not have a way of knowing. 

> But I was running ubuntu-8.04 so they might
> not have had good error reporting back then.

That old? If my Google-fu is correct, that would be 2.6.24,
released beginning of 2008. (I see you posted that initially,
seems I did not read carefully enough.)

Ok, so maybe this is some old, obscure driver bug or the
like that is actually a problem with 4TB disks (there used to be
a 2TB drive size limit as well....), not even a RAID issue.
I think if you do not run into this with a test with a current 
kernel, you will probably be fine.

Let me re-calculate: 4TB, 512Byte sectors => 33bit sector addresses.
Only 32 bit sector addresses with 2TB. Hmmm. Highly suspicious!
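
The arithmetic, as a small Python sketch (assuming 512-byte sectors
and decimal terabytes; the helper name is made up):

```python
SECTOR_BYTES = 512


def sector_address_bits(disk_bytes):
    """Bits needed to address the last sector of a disk of this size."""
    last_sector = disk_bytes // SECTOR_BYTES - 1
    return last_sector.bit_length()


bits_2tb = sector_address_bits(2 * 10**12)  # 32
bits_4tb = sector_address_bits(4 * 10**12)  # 33
```

The result is the same with binary units: 2TiB is exactly 2^32
sectors, so the last sector still fits in 32 bits, and 4TiB needs 33.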

If the sector number was indeed only handled 32 bit long, then
every write after 2TB should produce an error (good case) or
will go to the start of the disk (bad case). It is even possible
this only happens with multi-sector writes that pass the 2TB
boundary.
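
To illustrate the bad case, here is a Python sketch of what a 32 bit
truncation would do to sector numbers. This models my guess, not
any actual driver code, and the names are made up:

```python
MASK_32 = (1 << 32) - 1  # keeps only the low 32 bits of a sector number


def truncated_sector(sector):
    """Sector number as seen by code that only keeps 32 bits of it."""
    return sector & MASK_32


def landing_sectors(start, count):
    """Where each sector of a multi-sector write lands under truncation."""
    return [truncated_sector(s) for s in range(start, start + count)]


# A 4-sector write that crosses the 2^32 sector boundary:
crossing = landing_sectors(2**32 - 2, 4)  # [4294967294, 4294967295, 0, 1]
```

The last two sectors of that write end up on sectors 0 and 1, which
is exactly where the partition table and the start of the first
partition live.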

> > > I am not focusing my testing right now on isolating the problem, but
> > > converting to mdadm metadata-1.2 to see if the problem goes away.
> >
> > If it does, it would be a good idea to send a bug report to the
> > raid mailing list as well (no idea which one that is exactly
> > though).
> >
> Once I can find and get attached to the mailing list and get more
> understanding of the problem, I probably will, but ultimately I hope
> someone from this community will get this bug problem listed in some place
> useful.

Well, quite frankly, if the problem goes away with the new kernel,
don't bother. Your initial report can already be googled, that 
should be enough.

And I have to say, I admire your patience and persistence. 
Hope with the new Ubuntu this just goes away.

Arno
-- 
Arno Wagner,    Dr. sc. techn., Dipl. Inform.,   Email: arno@xxxxxxxxxxx 
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
----
One of the painful things about our time is that those who feel certainty 
are stupid, and those with any imagination and understanding are filled 
with doubt and indecision. -- Bertrand Russell 
_______________________________________________
dm-crypt mailing list
dm-crypt@xxxxxxxx
http://www.saout.de/mailman/listinfo/dm-crypt