Re: shown disk sizes


 



On Thu, 2013-07-18 at 09:43 +1000, NeilBrown wrote: 
> > --examine gives for both devices:
> > Avail Dev Size : 20969472 (10.00 GiB 10.74 GB)
> >      Array Size : 20969328 (10.00 GiB 10.74 GB)
> >   Used Dev Size : 20969328 (10.00 GiB 10.74 GB)
> >     Data Offset : 2048 sectors
> >    Super Offset : 8 sectors
> > 
> > => Avail is the available payload size on each component device,... so
> > given that we have the first 2048S for the superblock/bitmap/etc... that
> > fits exactly.
> > 
> > => Why is the array size / used dev size smaller?
> 
> Good question.  Not easy to answer ... it is rather convoluted.  Different
> bits of code try to reserve space for things differently and they don't end
> up agreeing.  I might try to simplify that.


I played around with this a bit more,... and got even more confused:

Made two files for losetup:
-rw-r--r--  1 root root 524288000 Jul 23 17:41 image1
-rw-r--r--  1 root root 524288000 Jul 23 17:41 image2

And a RAID1 out of it:
mdadm --create /dev/md/raid3 --verbose --metadata=1.2 --size=max
--level=raid1  --name=raid3  --raid-devices=2 /dev/loop0 /dev/loop1

Examine says:
# mdadm --examine /dev/loop0
/dev/loop0:
Avail Dev Size : 1023488 (499.83 MiB 524.03 MB)
     Array Size : 511680 (499.77 MiB 523.96 MB)
  Used Dev Size : 1023360 (499.77 MiB 523.96 MB)
    Data Offset : 512 sectors
   Super Offset : 8 sectors


Fine, so 524288000 B / 512 - 512 S = exactly the 1023488 S Avail Dev Size.
Array Size is given in KiB and Used Dev Size in sectors, so these two are identical.
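Spelled out as a quick sketch (all numbers copied from the --examine output above; S = 512-byte sectors):

```python
# Sanity-check mdadm's numbers for the 500 MB loop devices (all values
# taken from the --examine output above; S = 512-byte sectors).
SECTOR = 512
dev_bytes   = 524288000   # size of image1/image2
data_offset = 512         # "Data Offset", in sectors

avail_dev_size = dev_bytes // SECTOR - data_offset
print(avail_dev_size)               # -> 1023488, matches Avail Dev Size

array_size_kib  = 511680            # "Array Size" is reported in KiB
used_dev_size_s = 1023360           # "Used Dev Size" is in sectors
print(array_size_kib * 2 == used_dev_size_s)   # -> True, same size
print(avail_dev_size - used_dev_size_s)        # -> 128 "missing" sectors
```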

Questions
1) How does mdadm choose the data alignment? It also seems to use
quite "odd" numbers like 262144 sectors.

2) Again, some sectors (here 128 S) are missing... :(


Now for some fun:
cat /dev/md/raid > image
-rw-r--r--  1 root root 523960320 Jul 23 17:43 image
=> which is just the ArraySize / Used Dev Size... hurray...
I edited that file with a hex editor, making the following changes:
# hd i
00000000  43 41 4c 45 53 54 59 4f  5f 42 45 47 49 4e 00 00  |CALESTYO_BEGIN..|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
1f3afff0  00 00 00 00 43 41 4c 45  53 54 59 4f 5f 45 4e 44  |....CALESTYO_END|
1f3b0000

and wrote it back (cat image > /dev/md/raid).

Of course I couldn't resist stopping the RAID and reading the losetup
image files directly:
# hd image1 
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000  fc 4e 2b a9 01 00 00 00  00 00 00 00 00 00 00 00  |.N+.............|
00001010  99 05 68 13 83 97 70 53  bd 98 b6 9e 04 9c 1a 79  |..h...pS.......y|
00001020  6c 63 67 2d 6c 72 7a 2d  70 75 70 70 65 74 3a 72  |lcg-lrz-puppet:r|
00001030  61 69 64 33 00 00 00 00  00 00 00 00 00 00 00 00  |aid3............|
00001040  ed a2 ee 51 00 00 00 00  01 00 00 00 00 00 00 00  |...Q............|
00001050  80 9d 0f 00 00 00 00 00  00 00 00 00 02 00 00 00  |................|
00001060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001080  00 02 00 00 00 00 00 00  00 9e 0f 00 00 00 00 00  |................|
00001090  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000010a0  00 00 00 00 00 00 00 00  d4 58 14 b3 8c 07 e4 88  |.........X......|
000010b0  fa f7 61 90 83 2b 4c 0d  00 00 00 00 00 00 00 00  |..a..+L.........|
000010c0  2c a4 ee 51 00 00 00 00  13 00 00 00 00 00 00 00  |,..Q............|
000010d0  ff ff ff ff ff ff ff ff  c3 52 2b 16 80 00 00 00  |.........R+.....|
000010e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001100  00 00 01 00 fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
00001110  fe ff fe ff fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
*
00001200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  62 69 74 6d 04 00 00 00  35 c5 62 da cf d2 21 ea  |bitm....5.b...!.|
00002010  1a e5 49 74 92 7b 49 2e  00 00 00 00 00 00 00 00  |..It.{I.........|
00002020  00 00 00 00 00 00 00 00  80 9d 0f 00 00 00 00 00  |................|
00002030  00 00 00 00 00 00 00 04  05 00 00 00 00 00 00 00  |................|
00002040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002100  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00002200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00040000  43 41 4c 45 53 54 59 4f  5f 42 45 47 49 4e 00 00  |CALESTYO_BEGIN..|
00040010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
1f3efff0  00 00 00 00 43 41 4c 45  53 54 59 4f 5f 45 4e 44  |....CALESTYO_END|
1f3f0000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
1f400000

(the two image files are identical, especially in the addresses, except
for a few places in the header)

Okay here we go: 0x00040000 = 262144 B = 512 S... hurray... the data
starts at the data offset (who'd have expected this? :P)

The total size: 0x1f400000 = 524288000 B = 1024000 S, which is just the
size of my losetup image files

Total size minus data offset: 0x1f400000 - 0x00040000 = 524025856 B = 1023488 S,
which is again the Avail Dev Size... (i.e. NOT the Array Size)
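The hex-dump arithmetic can be double-checked in a few lines (offsets taken from the dump above):

```python
# Re-check the hex-dump offsets (in bytes) against the --examine numbers.
SECTOR = 512
data_start = 0x00040000   # where CALESTYO_BEGIN shows up in image1
tail_start = 0x1f3f0000   # where the zeroed tail region begins
dev_end    = 0x1f400000   # total size of the loop image file

print(data_start // SECTOR)               # -> 512, the Data Offset
print(dev_end)                            # -> 524288000, the file size
print((dev_end - data_start) // SECTOR)   # -> 1023488, Avail Dev Size
print((dev_end - tail_start) // SECTOR)   # -> 128, the "missing" sectors
```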

And the 1f400000-1f3f0000 are just the "missing" 128S.
Jihaw...

3) Stupid question... are these 128 S kept free for the 0.9/1.0
superblock-at-the-end disease? ;)
If so, I'd have expected that region to be at least 64 KiB and at most
128 KiB large,... but I've also seen this:
# mdadm --examine /dev/loop3
/dev/loop3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 968d9b80:d8714965:c3cb34cd:4d2952e6
           Name : lcg-lrz-puppet:raid2  (local to host lcg-lrz-puppet)
  Creation Time : Tue Jul 23 17:15:23 2013
     Raid Level : raid1
   Raid Devices : 2

Avail Dev Size : 1048313856 (499.88 GiB 536.74 GB)
     Array Size : 524156736 (499.87 GiB 536.74 GB)
  Used Dev Size : 1048313472 (499.87 GiB 536.74 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : bbcfa311:0e61c19a:5eafe78e:f53b1c15

Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Jul 23 17:15:23 2013
       Checksum : 23cbc3d7 - correct
         Events : 0

=> and there we have 384 S...
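Putting the two observations side by side (values straight from the two --examine outputs above; all in sectors):

```python
# Reserved tail = Avail Dev Size - Used Dev Size, for both example
# arrays shown above (all values in sectors, from --examine).
examples = {
    "raid3 (500 MB loop devices)": (1023488, 1023360),
    "raid2 (500 GB devices)":      (1048313856, 1048313472),
}
for name, (avail, used) in examples.items():
    print(name, "->", avail - used, "S reserved")
# raid3 -> 128 S, raid2 -> 384 S: evidently not a fixed 64 KiB region
```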



I was working on a spreadsheet which, starting from a few variables
like the first sector of the MD component device's partition, and so
on... and especially the desired (usable) array size... should give one
the necessary size for the component device.
Obviously, as long as I can't calculate how the size of the superblock
area comes together, I have no real chance.
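For what it's worth, the core formula such a spreadsheet needs is trivial once those two sizes are known; in this sketch, data_offset and reserved_tail are plain input parameters (the very values this thread is trying to pin down), not something the sketch can derive:

```python
# Sketch of the spreadsheet's core formula: component-device size needed
# for a desired usable (RAID1) array size. Both data_offset and
# reserved_tail are assumptions/inputs here, since how mdadm picks them
# is exactly the open question.
def component_sectors_needed(usable_sectors, data_offset, reserved_tail):
    """Sectors a component device must provide: superblock/bitmap area
    before the data, plus the payload, plus the reserved tail."""
    return data_offset + usable_sectors + reserved_tail

# Observed values for the small test array above:
print(component_sectors_needed(1023360, 512, 128))  # -> 1024000 sectors
```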


Cheers,
Chris.


