There are some issues in the bitmap-format html page. For example, some nested lists are shown as top-level lists (e.g. [1]- Here BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as top-level list). There is also a need of adding info about trailing checksum in the docs. Changes since v3: * spaces are used instead of tabs * fixed remaining <pre> blocks Changes since v2: The last two commits are updated to address the suggestions. These changes are - * previously omitted blank lines are re-added. In the updated commit, use of <pre> blocks are decreased. Description lists and + are used instead to add more than one paragraphs under lists. Readability of the source text might decrease due to the use of +. But other documentation files (e.g. git-add.txt) also use it to connect two paragraphs. So, I hope this is acceptable. * Information about trailing checksum is updated (as suggested by Taylor) Changes since v1: * a new commit addressing bitmap-format.txt html page generation is added * Remove extra indentation from the previous change * elaborate more about the trailing checksum (as suggested by Kaartic) initial version: * first commit fixes some formatting issues * information about trailing checksum in the bitmap file is added in the bitmap-format doc. [1] https://git-scm.com/docs/bitmap-format#_on_disk_format Abhradeep Chakraborty (3): bitmap-format.txt: feed the file to asciidoc to generate html bitmap-format.txt: fix some formatting issues bitmap-format.txt: add information for trailing checksum Documentation/Makefile | 1 + Documentation/technical/bitmap-format.txt | 203 ++++++++++++---------- 2 files changed, 108 insertions(+), 96 deletions(-) base-commit: 5699ec1b0aec51b9e9ba5a2785f65970c5a95d84 Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1246%2FAbhra303%2Ffix-doc-formatting-v4 Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1246/Abhra303/fix-doc-formatting-v4 Pull-Request: https://github.com/gitgitgadget/git/pull/1246 Range-diff vs v3: 1: a1b9bd9af90 ! 1: 494c1c1bd52 bitmap-format.txt: feed the file to asciidoc to generate html @@ Documentation/Makefile: TECH_DOCS += MyFirstContribution TECH_DOCS += ToolsForGit +TECH_DOCS += technical/bitmap-format TECH_DOCS += technical/bundle-format + TECH_DOCS += technical/cruft-packs TECH_DOCS += technical/hash-function-transition - TECH_DOCS += technical/http-protocol 2: c74b9a52c2a ! 2: 25512aa9c5b bitmap-format.txt: fix some formatting issues @@ Commit message Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@xxxxxxxxx> ## Documentation/technical/bitmap-format.txt ## +@@ Documentation/technical/bitmap-format.txt: An object is uniquely described by its bit position within a bitmap: + is defined as follows: + + o1 <= o2 <==> pack(o1) <= pack(o2) /\ offset(o1) <= offset(o2) +- +- The ordering between packs is done according to the MIDX's .rev file. +- Notably, the preferred pack sorts ahead of all other packs. +++ ++The ordering between packs is done according to the MIDX's .rev file. ++Notably, the preferred pack sorts ahead of all other packs. + + The on-disk representation (described below) of a bitmap is the same regardless + of whether or not that bitmap belongs to a packfile or a MIDX. The only @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required. == On-disk format - - A header appears at the beginning: -+ * A header appears at the beginning: - +- - 4-byte signature: {'B', 'I', 'T', 'M'} -+ 4-byte signature: :: {'B', 'I', 'T', 'M'} -+ -+ 2-byte version number (network byte order): :: - +- - 2-byte version number (network byte order) - The current implementation only supports version 1 - of the bitmap index (the same one as JGit). - +- The current implementation only supports version 1 +- of the bitmap index (the same one as JGit). +- - 2-byte flags (network byte order) -+ 2-byte flags (network byte order): :: - - The following flags are supported: - +- +- The following flags are supported: +- - - BITMAP_OPT_FULL_DAG (0x1) REQUIRED -+ ** {empty} -+ BITMAP_OPT_FULL_DAG (0x1) REQUIRED: ::: -+ - This flag must always be present. It implies that the - bitmap index has been generated for a packfile or - multi-pack index (MIDX) with full closure (i.e. where -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required. - JGit, that greatly reduces the complexity of the - implementation. - +- This flag must always be present. It implies that the +- bitmap index has been generated for a packfile or +- multi-pack index (MIDX) with full closure (i.e. where +- every single object in the packfile/MIDX can find its +- parent links inside the same packfile/MIDX). This is a +- requirement for the bitmap index format, also present in +- JGit, that greatly reduces the complexity of the +- implementation. +- - - BITMAP_OPT_HASH_CACHE (0x4) -+ ** {empty} -+ BITMAP_OPT_HASH_CACHE (0x4): ::: -+ - If present, the end of the bitmap file contains - `N` 32-bit name-hash values, one per object in the - pack/MIDX. The format and meaning of the name-hash is - described below. - +- If present, the end of the bitmap file contains +- `N` 32-bit name-hash values, one per object in the +- pack/MIDX. The format and meaning of the name-hash is +- described below. +- - 4-byte entry count (network byte order) - -+ 4-byte entry count (network byte order): :: - The total count of entries (bitmapped commits) in this bitmap index. - +- The total count of entries (bitmapped commits) in this bitmap index. +- - 20-byte checksum - -+ 20-byte checksum: :: - The SHA1 checksum of the pack/MIDX this bitmap index - belongs to. - +- The SHA1 checksum of the pack/MIDX this bitmap index +- belongs to. +- - - 4 EWAH bitmaps that act as type indexes - - Type indexes are serialized after the hash cache in the shape @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac - Each entry contains the following: - - - 4-byte object position (network byte order) -+ * 4 EWAH bitmaps that act as type indexes +- The position **in the index for the packfile or +- multi-pack index** where the bitmap for this commit is +- found. +- +- - 1-byte XOR-offset +- The xor offset used to compress this bitmap. For an entry +- in position `x`, a XOR offset of `y` means that the actual +- bitmap representing this commit is composed by XORing the +- bitmap for this entry with the bitmap in entry `x-y` (i.e. +- the bitmap `y` entries before this one). +- +- Note that this compression can be recursive. In order to +- XOR this entry with a previous one, the previous entry needs +- to be decompressed first, and so on. +- +- The hard-limit for this offset is 160 (an entry can only be +- xor'ed against one of the 160 entries preceding it). This +- number is always positive, and hence entries are always xor'ed +- with **previous** bitmaps, not bitmaps that will come afterwards +- in the index. +- +- - 1-byte flags for this bitmap +- At the moment the only available flag is `0x1`, which hints +- that this bitmap can be re-used when rebuilding bitmap indexes +- for the repository. +- +- - The compressed bitmap itself, see Appendix A. ++ * A header appears at the beginning: ++ ++ 4-byte signature: :: {'B', 'I', 'T', 'M'} ++ ++ 2-byte version number (network byte order): :: ++ ++ The current implementation only supports version 1 ++ of the bitmap index (the same one as JGit). ++ ++ 2-byte flags (network byte order): :: ++ ++ The following flags are supported: ++ ++ ** {empty} ++ BITMAP_OPT_FULL_DAG (0x1) REQUIRED: ::: ++ ++ This flag must always be present. It implies that the ++ bitmap index has been generated for a packfile or ++ multi-pack index (MIDX) with full closure (i.e. where ++ every single object in the packfile/MIDX can find its ++ parent links inside the same packfile/MIDX). This is a ++ requirement for the bitmap index format, also present in ++ JGit, that greatly reduces the complexity of the ++ implementation. ++ ++ ** {empty} ++ BITMAP_OPT_HASH_CACHE (0x4): ::: ++ ++ If present, the end of the bitmap file contains ++ `N` 32-bit name-hash values, one per object in the ++ pack/MIDX. The format and meaning of the name-hash is ++ described below. ++ ++ 4-byte entry count (network byte order): :: ++ The total count of entries (bitmapped commits) in this bitmap index. ++ ++ 20-byte checksum: :: ++ The SHA1 checksum of the pack/MIDX this bitmap index ++ belongs to. ++ ++ * 4 EWAH bitmaps that act as type indexes ++ +Type indexes are serialized after the hash cache in the shape +of four EWAH bitmaps stored consecutively (see Appendix A for @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac +There is a bitmap for each Git object type, stored in the following +order: ++ -+ - Commits -+ - Trees -+ - Blobs -+ - Tags ++ - Commits ++ - Trees ++ - Blobs ++ - Tags + ++ +In each bitmap, the `n`th bit is set to true if the `n`th object +in the packfile or multi-pack index is of that type. +++ ++The obvious consequence is that the OR of all 4 bitmaps will result ++in a full set (all bits set), and the AND of all 4 bitmaps will ++result in an empty bitmap (no bits set). + -+ The obvious consequence is that the OR of all 4 bitmaps will result -+ in a full set (all bits set), and the AND of all 4 bitmaps will -+ result in an empty bitmap (no bits set). -+ -+ * N entries with compressed bitmaps, one for each indexed commit ++ * N entries with compressed bitmaps, one for each indexed commit ++ +Where `N` is the total amount of entries in this bitmap index. +Each entry contains the following: + -+ ** {empty} -+ 4-byte object position (network byte order): :: - The position **in the index for the packfile or - multi-pack index** where the bitmap for this commit is - found. - -- - 1-byte XOR-offset -+ ** {empty} -+ 1-byte XOR-offset: :: - The xor offset used to compress this bitmap. For an entry - in position `x`, a XOR offset of `y` means that the actual - bitmap representing this commit is composed by XORing the - bitmap for this entry with the bitmap in entry `x-y` (i.e. - the bitmap `y` entries before this one). -- -- Note that this compression can be recursive. In order to -- XOR this entry with a previous one, the previous entry needs -- to be decompressed first, and so on. -- -- The hard-limit for this offset is 160 (an entry can only be -- xor'ed against one of the 160 entries preceding it). This -- number is always positive, and hence entries are always xor'ed -- with **previous** bitmaps, not bitmaps that will come afterwards -- in the index. -- -- - 1-byte flags for this bitmap ++ ** {empty} ++ 4-byte object position (network byte order): :: ++ The position **in the index for the packfile or ++ multi-pack index** where the bitmap for this commit is ++ found. ++ ++ ** {empty} ++ 1-byte XOR-offset: :: ++ The xor offset used to compress this bitmap. For an entry ++ in position `x`, a XOR offset of `y` means that the actual ++ bitmap representing this commit is composed by XORing the ++ bitmap for this entry with the bitmap in entry `x-y` (i.e. ++ the bitmap `y` entries before this one). ++ +NOTE: This compression can be recursive. In order to +XOR this entry with a previous one, the previous entry needs @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac +with **previous** bitmaps, not bitmaps that will come afterwards +in the index. + -+ ** {empty} -+ 1-byte flags for this bitmap: :: - At the moment the only available flag is `0x1`, which hints - that this bitmap can be re-used when rebuilding bitmap indexes - for the repository. - -- - The compressed bitmap itself, see Appendix A. -+ ** The compressed bitmap itself, see Appendix A. ++ ** {empty} ++ 1-byte flags for this bitmap: :: ++ At the moment the only available flag is `0x1`, which hints ++ that this bitmap can be re-used when rebuilding bitmap indexes ++ for the repository. ++ ++ ** The compressed bitmap itself, see Appendix A. == Appendix A: Serialization format for an EWAH bitmap +@@ Documentation/technical/bitmap-format.txt: implementation: + - 4-byte number of words of the COMPRESSED bitmap, when stored + + - N x 8-byte words, as specified by the previous field +- +- This is the actual content of the compressed bitmap. +++ ++This is the actual content of the compressed bitmap. + + - 4-byte position of the current RLW for the compressed + bitmap 3: b971558e1cb ! 3: dbb86dca205 bitmap-format.txt: add information for trailing checksum @@ Commit message ## Documentation/technical/bitmap-format.txt ## @@ Documentation/technical/bitmap-format.txt: in the index. - ** The compressed bitmap itself, see Appendix A. + ** The compressed bitmap itself, see Appendix A. + * {empty} + TRAILER: :: -- gitgitgadget