Re: large container files

In principle, the loss probabilities on HDDs work a bit differently
from what I gather you assume.

1. There are defective sectors. That is probably what you are
   thinking of. There is no hard data, but my impression is that
   these are more tied to the device than to the number of sectors on
   it, i.e. the probability of a device developing a bad sector is
   something like constant_a + constant_b * number_of_sectors_on_device
   (see the small calculation after this list).

2. You have whole-disk losses. These do not seem to depend on the
   actual disk size and may even be less likely for larger (hence more
   modern) drives.
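
To make the consequence of 1. explicit (this is just my reading of the
model above, assuming bad sectors are spread roughly evenly over the
surface):

   P(one specific sector goes bad)
     ~ (constant_a + constant_b * N) / N
     = constant_a / N + constant_b        (N = sectors on the device)

That is what the "less likely on larger devices" argument below rests
on: the risk to any one particular sector goes _down_ as N grows.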

Now for your question: You lose a LUKS volume when you lose the
LUKS header. All other errors do not amplify, i.e. a raw bad
sector just transforms into one encrypted bad sector.

For losing the LUKS header, you actually need to lose the salt or
other stuff right at the beginning. The probability of that happening
is _less_ for larger devices, if my intuition in 1. above is correct.
The second way to lose the filesystem is by a bad sector in a
key-block, if you have only one key slot in use. The same reasoning
applies, however the key-blocks are in the MB size range, i.e. damage
to one of them is something like 3 orders of magnitude more likely.
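
To put very rough numbers on that (the sizes are my assumptions for a
typical LUKS header, check your own layout with 'cryptsetup luksDump'):

   salt and other header fields:   ~1 KiB at the start of the device
   key material (all 8 key slots): on the order of 1 MiB
   whole device:                   ~2.5 * 10^12 bytes

   chance a random bad sector hits the header fields:
      ~10^3 / 2.5*10^12  ~ 4*10^-10
   chance it hits the key material:
      ~10^6 / 2.5*10^12  ~ 4*10^-7

Both are tiny, and the factor between them is the ~3 orders of
magnitude mentioned above.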

As to full-disk loss, that depends on how you build your large
device and how much redundancy you have. For 2.5TB, I would
recommend using RAID6 (works well in software under Linux).
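
Purely as an illustration (device names and the number of disks are
made up, adjust them to your hardware), a software RAID6 over four
disks can be created with mdadm like this:

   mdadm --create /dev/md0 --level=6 --raid-devices=4 \
         /dev/sda1 /dev/sdb1 /dev/sdc1 /dev/sdd1

RAID6 keeps working with up to two failed disks, which is what makes
it attractive at this capacity.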

Lastly, this is a container file, not a device. AFAIK there is a
very low file corruption risk as long as the structural metadata
(i.e. the inode and block assignment) of the Linux filesystem holding
it does not get rewritten, i.e. as long as you do not append to the
file. In fact, I have done extensive measurements on data in the
multi-TB range with individual files in the 200MB-1GB range, mostly
on ext3. I never lost a file. In addition, I do not see any reason
why the loss probability should be a lot larger with larger files.
True, you could have an unrecoverable bad sector right in the
metadata, but the metadata is so tiny compared to the data area that
this is still highly unlikely for a 2.5TB file. I think full-disk
loss or array loss for a RAID array is a lot more likely.
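
If you go the container-file route, the setup could look roughly like
this (paths, loop device and mapping name are only examples, and
preallocating with dd means you never have to append to the file
afterwards):

   # preallocate a ~2.5TB container file (this takes a while)
   dd if=/dev/zero of=/data/container.img bs=1M count=2560000

   # attach it to a loop device and put LUKS on top of that
   losetup /dev/loop0 /data/container.img
   cryptsetup luksFormat /dev/loop0
   cryptsetup luksOpen /dev/loop0 bigcontainer
   mkfs.ext3 /dev/mapper/bigcontainer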

On the other hand, if you split this into smaller files and then
join them together with LVM, you actually get more metadata and
therefore a higher risk.

So, short answer: There are risks with this much data, but a
single file is a low-risk approach, and only moderately more
risky than using a partition. This does, however, not include
operator error. A file is easier to delete than a partition,
and there is typically no undelete on Linux. Backup is, as
always, non-optional.

What you should of course do is run regular disk surface
scans (long SMART self-tests) and filesystem checks.
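
For example (device and mapping names are again just placeholders):

   smartctl -t long /dev/sda              # start a long surface self-test
   smartctl -l selftest /dev/sda          # read the result once it is done
   fsck.ext3 -f /dev/mapper/bigcontainer  # check the fs while it is unmounted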

Arno





On Sun, Apr 26, 2009 at 12:21:04PM +0200, ingo.schmitt@xxxxxxxxxxxxxxxxx wrote:
> Hi,
> 
> is it a good idea to use a container file which is 2.5TB large?
> Is there a higher risk of losing data when the file is so large?
> 
> I cannot use the whole partition because the drives are managed by LVM
> and this is too complicated for me.
> 
> Thx,
> Ingo
> 
> 
> 
> 

-- 
Arno Wagner, Dr. sc. techn., Dipl. Inform., CISSP -- Email: arno@xxxxxxxxxxx 
GnuPG:  ID: 1E25338F  FP: 0C30 5782 9D93 F785 E79C  0296 797F 6B50 1E25 338F
----
Cuddly UI's are the manifestation of wishful thinking. -- Dylan Evans

If it's in the news, don't worry about it.  The very definition of 
"news" is "something that hardly ever happens." -- Bruce Schneier 

---------------------------------------------------------------------
dm-crypt mailing list - http://www.saout.de/misc/dm-crypt/
To unsubscribe, e-mail: dm-crypt-unsubscribe@xxxxxxxx
For additional commands, e-mail: dm-crypt-help@xxxxxxxx

