On Thu, 2013-07-18 at 09:43 +1000, NeilBrown wrote:
> > --examine gives for both devices:
> >  Avail Dev Size : 20969472 (10.00 GiB 10.74 GB)
> >      Array Size : 20969328 (10.00 GiB 10.74 GB)
> >   Used Dev Size : 20969328 (10.00 GiB 10.74 GB)
> >     Data Offset : 2048 sectors
> >    Super Offset : 8 sectors
> >
> > => Avail is the available payload size on each component device,... so
> > given that we have the first 2048S for the superblock/bitmap/etc... that
> > fits exactly.
> >
> > => Why is the array size / used dev size smaller?
>
> Good question.  Not easy to answer ... it is rather convoluted.  Different
> bits of code try to reserve space for things differently and they don't end
> up agreeing.  I might try to simplify that.

I played around here a bit more,... and got even more confused:

Made two files for losetup:
-rw-r--r-- 1 root root 524288000 Jul 23 17:41 image1
-rw-r--r-- 1 root root 524288000 Jul 23 17:41 image2

And a RAID1 out of them:
mdadm --create /dev/md/raid3 --verbose --metadata=1.2 --size=max --level=raid1 --name=raid3 --raid-devices=2 /dev/loop0 /dev/loop1

Examine says:
# mdadm --examine /dev/loop0
/dev/loop0:
 Avail Dev Size : 1023488 (499.83 MiB 524.03 MB)
     Array Size : 511680 (499.77 MiB 523.96 MB)
  Used Dev Size : 1023360 (499.77 MiB 523.96 MB)
    Data Offset : 512 sectors
   Super Offset : 8 sectors

Fine, so 524288000/512 - 512S = exactly the 1023488 Avail Dev Size.
Array Size is given in KiB and Used Dev Size in sectors, so these two are
identical.

Questions:
1) How does mdadm choose the data alignment? It seems to also use
   completely "odd" numbers like 262144 sectors.
2) Again, some sectors (here 128 S) are missing... :(

Now I had some fun:
cat /dev/md/raid > image
-rw-r--r-- 1 root root 523960320 Jul 23 17:43 image
=> which is just the Array Size / Used Dev Size... hurray...
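The size arithmetic so far can be re-checked with a few lines of Python. This is only a sketch: it assumes 512-byte sectors and that Array Size is reported in KiB, and the variable names are made up for illustration (they are not mdadm identifiers). It does not explain where the 128 sectors go, it only confirms that the numbers above are consistent with each other.

```python
SECTOR = 512  # bytes per sector (assumption used throughout this mail)

# Values copied from the ls and mdadm --examine output above.
image_bytes = 524288000    # size of each losetup backing file
data_offset = 512          # "Data Offset", in sectors
avail_dev_size = 1023488   # "Avail Dev Size", in sectors
array_size_kib = 511680    # "Array Size", assumed to be in KiB
used_dev_size = 1023360    # "Used Dev Size", in sectors
dump_bytes = 523960320     # size of the file read back from the md device


def kib_to_sectors(kib):
    """Convert a size in KiB to 512-byte sectors."""
    return kib * 1024 // SECTOR


# Payload area: whole image minus the data offset.
payload_sectors = image_bytes // SECTOR - data_offset

# Sectors that are available on the device but not used by the array.
missing_sectors = avail_dev_size - used_dev_size

print("payload sectors      :", payload_sectors)
print("array size in sectors:", kib_to_sectors(array_size_kib))
print("dump size in sectors :", dump_bytes // SECTOR)
print("missing sectors      :", missing_sectors)
```

The first line of output matches Avail Dev Size, the second and third match Used Dev Size, and the last is the unexplained gap.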

I edited that file with a hex editor, making the following changes:
# hd i
00000000  43 41 4c 45 53 54 59 4f  5f 42 45 47 49 4e 00 00  |CALESTYO_BEGIN..|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
1f3afff0  00 00 00 00 43 41 4c 45  53 54 59 4f 5f 45 4e 44  |....CALESTYO_END|
1f3b0000
and wrote it back (cat image > /dev/md/raid).

Of course I couldn't resist stopping the raid and reading the losetup
image files directly:
# hd image1
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000  fc 4e 2b a9 01 00 00 00  00 00 00 00 00 00 00 00  |.N+.............|
00001010  99 05 68 13 83 97 70 53  bd 98 b6 9e 04 9c 1a 79  |..h...pS.......y|
00001020  6c 63 67 2d 6c 72 7a 2d  70 75 70 70 65 74 3a 72  |lcg-lrz-puppet:r|
00001030  61 69 64 33 00 00 00 00  00 00 00 00 00 00 00 00  |aid3............|
00001040  ed a2 ee 51 00 00 00 00  01 00 00 00 00 00 00 00  |...Q............|
00001050  80 9d 0f 00 00 00 00 00  00 00 00 00 02 00 00 00  |................|
00001060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001080  00 02 00 00 00 00 00 00  00 9e 0f 00 00 00 00 00  |................|
00001090  08 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
000010a0  00 00 00 00 00 00 00 00  d4 58 14 b3 8c 07 e4 88  |.........X......|
000010b0  fa f7 61 90 83 2b 4c 0d  00 00 00 00 00 00 00 00  |..a..+L.........|
000010c0  2c a4 ee 51 00 00 00 00  13 00 00 00 00 00 00 00  |,..Q............|
000010d0  ff ff ff ff ff ff ff ff  c3 52 2b 16 80 00 00 00  |.........R+.....|
000010e0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001100  00 00 01 00 fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
00001110  fe ff fe ff fe ff fe ff  fe ff fe ff fe ff fe ff  |................|
*
00001200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000  62 69 74 6d 04 00 00 00  35 c5 62 da cf d2 21 ea  |bitm....5.b...!.|
00002010  1a e5 49 74 92 7b 49 2e  00 00 00 00 00 00 00 00  |..It.{I.........|
00002020  00 00 00 00 00 00 00 00  80 9d 0f 00 00 00 00 00  |................|
00002030  00 00 00 00 00 00 00 04  05 00 00 00 00 00 00 00  |................|
00002040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002100  ff ff ff ff ff ff ff ff  ff ff ff ff ff ff ff ff  |................|
*
00002200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00040000  43 41 4c 45 53 54 59 4f  5f 42 45 47 49 4e 00 00  |CALESTYO_BEGIN..|
00040010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
1f3efff0  00 00 00 00 43 41 4c 45  53 54 59 4f 5f 45 4e 44  |....CALESTYO_END|
1f3f0000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
1f400000
(Both image files are the same, especially the addresses, except for some
places in the header.)

Okay, here we go:
0x00040000 = 262144 B = 512 S... hurray... the data starts at the data
offset (who'd have expected this? :P)
The total size: 0x1f400000 = 524288000 B = 1024000 S, which is just the
size of my losetup image files.
Total size minus data offset: 0x1f400000 - 0x00040000 = 524025856 B =
1023488 S, which is again the Avail Dev Size... (i.e. NOT the array size)
And 1f400000 - 1f3f0000 are just the "missing" 128 S. Jihaw...

3) Stupid question... are these 128 S kept free for the 0.9/1.0
   superblock-at-the-end disease? ;)
   If so,... I'd have expected this region to be at least 64K and at most
   128K large,... but I've also had this:
# mdadm --examine /dev/loop3
/dev/loop3:
          Magic : a92b4efc
        Version : 1.2
    Feature Map : 0x1
     Array UUID : 968d9b80:d8714965:c3cb34cd:4d2952e6
           Name : lcg-lrz-puppet:raid2  (local to host lcg-lrz-puppet)
  Creation Time : Tue Jul 23 17:15:23 2013
     Raid Level : raid1
   Raid Devices : 2

 Avail Dev Size : 1048313856 (499.88 GiB 536.74 GB)
     Array Size : 524156736 (499.87 GiB 536.74 GB)
  Used Dev Size : 1048313472 (499.87 GiB 536.74 GB)
    Data Offset : 262144 sectors
   Super Offset : 8 sectors
          State : active
    Device UUID : bbcfa311:0e61c19a:5eafe78e:f53b1c15

Internal Bitmap : 8 sectors from superblock
    Update Time : Tue Jul 23 17:15:23 2013
       Checksum : 23cbc3d7 - correct
         Events : 0

=> and there we have 384 S...

I was working on a spreadsheet which, starting from a few variables like
the first sector of the MD's component device partition, and so on... and
especially the desired (usable) array size, should give one the necessary
size for the component device.
Obviously, as long as I can't calculate how the size of the superblock
comes together... I have no real chance.


Cheers,
Chris.
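
For reference, the offset arithmetic from the hexdump and from the second --examine output can be re-checked the same way. Again only a sketch, assuming 512-byte sectors; the names below are illustrative and the offsets are simply the ones read off the hd output of image1.

```python
SECTOR = 512  # bytes per sector (assumption)


def to_sectors(nbytes):
    """Convert bytes to whole 512-byte sectors, refusing partial sectors."""
    sectors, rest = divmod(nbytes, SECTOR)
    if rest:
        raise ValueError(f"{nbytes} is not a multiple of {SECTOR}")
    return sectors


# Offsets read off the hexdump of image1.
data_start = 0x00040000  # CALESTYO_BEGIN marker, i.e. start of array data
data_end = 0x1F3F0000    # first byte after the CALESTYO_END marker
image_end = 0x1F400000   # end of the losetup image file

data_offset_s = to_sectors(data_start)             # should equal Data Offset
avail_s = to_sectors(image_end - data_start)       # should equal Avail Dev Size
used_s = to_sectors(data_end - data_start)         # should equal Used Dev Size
tail_gap_s = to_sectors(image_end - data_end)      # unused tail of the image

# Same gap for the raid2 device, from its --examine output (in sectors).
raid2_gap_s = 1048313856 - 1048313472
raid2_gap_kib = raid2_gap_s * SECTOR // 1024

print("data offset :", data_offset_s, "S")
print("avail       :", avail_s, "S")
print("used        :", used_s, "S")
print("tail gap    :", tail_gap_s, "S")
print("raid2 gap   :", raid2_gap_s, "S =", raid2_gap_kib, "KiB")
```

The raid2 gap works out to 192 KiB, which is above the 128K upper bound guessed in question 3.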