There are some issues in the bitmap-format html page. For example, some
nested lists are shown as top-level lists (see [1], where
BITMAP_OPT_FULL_DAG (0x1) and BITMAP_OPT_HASH_CACHE (0x4) are shown as a
top-level list). The docs also lack information about the trailing
checksum.

Changes since v2:

The last two commits are updated to address the suggestions. The changes
are:

 * Previously omitted blank lines are re-added. The updated commit uses
   fewer <pre> blocks; description lists and `+` are used instead to put
   more than one paragraph under a list item. Readability of the source
   text might decrease due to the use of `+`, but other documentation
   files (e.g. git-add.txt) also use it to connect two paragraphs, so I
   hope this is acceptable.

 * Information about the trailing checksum is updated (as suggested by
   Taylor).

Changes since v1:

 * A new commit addressing bitmap-format.txt html page generation is
   added.
 * Extra indentation from the previous change is removed.
 * The trailing checksum is described in more detail (as suggested by
   Kaartic).

Initial version:

 * The first commit fixes some formatting issues.
 * Information about the trailing checksum in the bitmap file is added
   to the bitmap-format doc.
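To make the documented layout concrete, here is a minimal Python sketch. It is not part of this series and is not Git's implementation. It reads the header fields in the order bitmap-format.txt lists them (signature, version, flags, entry count, pack/MIDX checksum) and checks the trailing checksum. It assumes a SHA-1 repository (20-byte hashes) and that the trailer is the SHA-1 of every byte that precedes it; the function and constant names are hypothetical.

```python
import hashlib
import struct

# Header layout as described in bitmap-format.txt, all in network byte order:
# 4-byte signature, 2-byte version, 2-byte flags, 4-byte entry count,
# 20-byte checksum of the pack/MIDX the bitmap belongs to.
HEADER = struct.Struct(">4sHHI20s")

BITMAP_OPT_FULL_DAG = 0x1
BITMAP_OPT_HASH_CACHE = 0x4
HASH_LEN = 20  # assumption: SHA-1 repository


def parse_bitmap_header(data: bytes) -> dict:
    """Decode the fixed-size header at the start of a bitmap file."""
    if len(data) < HEADER.size:
        raise ValueError("too short to contain a bitmap header")
    signature, version, flags, entry_count, pack_checksum = HEADER.unpack_from(data, 0)
    if signature != b"BITM":
        raise ValueError("bad signature")
    if version != 1:
        raise ValueError("unsupported bitmap version: %d" % version)
    if not flags & BITMAP_OPT_FULL_DAG:
        raise ValueError("BITMAP_OPT_FULL_DAG is required but not set")
    return {
        "version": version,
        "flags": flags,
        "has_hash_cache": bool(flags & BITMAP_OPT_HASH_CACHE),
        "entry_count": entry_count,
        "pack_checksum": pack_checksum.hex(),
    }


def trailing_checksum_ok(data: bytes) -> bool:
    """Compare the last HASH_LEN bytes with the SHA-1 of everything before them."""
    if len(data) < HASH_LEN:
        return False
    body, trailer = data[:-HASH_LEN], data[-HASH_LEN:]
    return hashlib.sha1(body).digest() == trailer
```

Both functions take the whole file as bytes, e.g. the result of `open(path, "rb").read()` on a `.bitmap` file.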
[1] https://git-scm.com/docs/bitmap-format#_on_disk_format

Abhradeep Chakraborty (3):
  bitmap-format.txt: feed the file to asciidoc to generate html
  bitmap-format.txt: fix some formatting issues
  bitmap-format.txt: add information for trailing checksum

 Documentation/Makefile                    |   1 +
 Documentation/technical/bitmap-format.txt | 113 ++++++++++++----------
 2 files changed, 63 insertions(+), 51 deletions(-)

base-commit: 2668e3608e47494f2f10ef2b6e69f08a84816bcb
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1246%2FAbhra303%2Ffix-doc-formatting-v3
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1246/Abhra303/fix-doc-formatting-v3
Pull-Request: https://github.com/gitgitgadget/git/pull/1246

Range-diff vs v2:

 1:  a1b9bd9af90 = 1:  a1b9bd9af90 bitmap-format.txt: feed the file to asciidoc to generate html
 2:  cb919513c14 ! 2:  c74b9a52c2a bitmap-format.txt: fix some formatting issues
     @@ Commit message
          format.txt` is broken. This is mainly because `-` is used for nested
          lists (which is not allowed in asciidoc) instead of `*`.

     -    Fix these and also reformat it (e.g. removing some blank lines) for
     -    better readability of the html page.
     +    Fix these and also reformat it for better readability of the html page.

          Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@xxxxxxxxx>

     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cac
     - - A header appears at the beginning:
     + * A header appears at the beginning:
     - 4-byte signature: {'B', 'I', 'T', 'M'}
     +- 4-byte signature: {'B', 'I', 'T', 'M'}
     ++ 4-byte signature: :: {'B', 'I', 'T', 'M'}
     ++
     ++ 2-byte version number (network byte order): ::
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     +- 2-byte version number (network byte order)
     +
       The current implementation only supports version 1
       of the bitmap index (the same one as JGit).
     - 2-byte flags (network byte order)
     --
     +- 2-byte flags (network byte order)
     ++ 2-byte flags (network byte order): ::
     +
       The following flags are supported:
     --
     - - BITMAP_OPT_FULL_DAG (0x1) REQUIRED
     +
     +- - BITMAP_OPT_FULL_DAG (0x1) REQUIRED
     ++ ** {empty}
     ++ BITMAP_OPT_FULL_DAG (0x1) REQUIRED: :::
     ++
       This flag must always be present. It implies that the
       bitmap index has been generated for a packfile or
     + multi-pack index (MIDX) with full closure (i.e. where
     @@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     - requirement for the bitmap index format, also present in
       JGit, that greatly reduces the complexity of the
       implementation.
     --
     - - BITMAP_OPT_HASH_CACHE (0x4)
     +
     +- - BITMAP_OPT_HASH_CACHE (0x4)
     ++ ** {empty}
     ++ BITMAP_OPT_HASH_CACHE (0x4): :::
     ++
       If present, the end of the bitmap file contains
       `N` 32-bit name-hash values, one per object in the
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     + pack/MIDX. The format and meaning of the name-hash is
       described below.
     - 4-byte entry count (network byte order)
     +- 4-byte entry count (network byte order)
     -
     ++ 4-byte entry count (network byte order): ::
       The total count of entries (bitmapped commits) in this bitmap index.
     - 20-byte checksum
     +- 20-byte checksum
     -
     ++ 20-byte checksum: ::
       The SHA1 checksum of the pack/MIDX this bitmap index belongs to.
     - - 4 EWAH bitmaps that act as type indexes
     -+ * 4 EWAH bitmaps that act as type indexes
     -
     - Type indexes are serialized after the hash cache in the shape
     - of four EWAH bitmaps stored consecutively (see Appendix A for
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     -
     - There is a bitmap for each Git object type, stored in the following
     - order:
     -
     - - Commits
     - - Trees
     - - Blobs
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     - in a full set (all bits set), and the AND of all 4 bitmaps will
     - result in an empty bitmap (no bits set).
     -
     +- Type indexes are serialized after the hash cache in the shape
     +- of four EWAH bitmaps stored consecutively (see Appendix A for
     +- the serialization format of an EWAH bitmap).
     +-
     +- There is a bitmap for each Git object type, stored in the following
     +- order:
     +-
     +- - Commits
     +- - Trees
     +- - Blobs
     +- - Tags
     +-
     +- In each bitmap, the `n`th bit is set to true if the `n`th object
     +- in the packfile or multi-pack index is of that type.
     +-
     +- The obvious consequence is that the OR of all 4 bitmaps will result
     +- in a full set (all bits set), and the AND of all 4 bitmaps will
     +- result in an empty bitmap (no bits set).
     +-
     - - N entries with compressed bitmaps, one for each indexed commit
     -+ * N entries with compressed bitmaps, one for each indexed commit
     -
     - Where `N` is the total amount of entries in this bitmap index.
     - Each entry contains the following:
     -
     +-
     +- Where `N` is the total amount of entries in this bitmap index.
     +- Each entry contains the following:
     +-
     - - 4-byte object position (network byte order)
     -+ ** 4-byte object position (network byte order)
     ++ * 4 EWAH bitmaps that act as type indexes
     +++
     ++Type indexes are serialized after the hash cache in the shape
     ++of four EWAH bitmaps stored consecutively (see Appendix A for
     ++the serialization format of an EWAH bitmap).
     +++
     ++There is a bitmap for each Git object type, stored in the following
     ++order:
     +++
     ++ - Commits
     ++ - Trees
     ++ - Blobs
     ++ - Tags
     ++
     +++
     ++In each bitmap, the `n`th bit is set to true if the `n`th object
     ++in the packfile or multi-pack index is of that type.
     ++
     ++ The obvious consequence is that the OR of all 4 bitmaps will result
     ++ in a full set (all bits set), and the AND of all 4 bitmaps will
     ++ result in an empty bitmap (no bits set).
     ++
     ++ * N entries with compressed bitmaps, one for each indexed commit
     +++
     ++Where `N` is the total amount of entries in this bitmap index.
     ++Each entry contains the following:
     ++
     ++ ** {empty}
     ++ 4-byte object position (network byte order): ::
       The position **in the index for the packfile or
       multi-pack index** where the bitmap for this commit is found.
     - - 1-byte XOR-offset
     -+ ** 1-byte XOR-offset
     ++ ** {empty}
     ++ 1-byte XOR-offset: ::
       The xor offset used to compress this bitmap. For an entry
       in position `x`, a XOR offset of `y` means that the actual
       bitmap representing this commit is composed by XORing the
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     - with **previous** bitmaps, not bitmaps that will come afterwards
     - in the index.
     -
     + bitmap for this entry with the bitmap in entry `x-y` (i.e.
     + the bitmap `y` entries before this one).
     +-
     +- Note that this compression can be recursive. In order to
     +- XOR this entry with a previous one, the previous entry needs
     +- to be decompressed first, and so on.
     +-
     +- The hard-limit for this offset is 160 (an entry can only be
     +- xor'ed against one of the 160 entries preceding it). This
     +- number is always positive, and hence entries are always xor'ed
     +- with **previous** bitmaps, not bitmaps that will come afterwards
     +- in the index.
     +-
     - - 1-byte flags for this bitmap
     -+ ** 1-byte flags for this bitmap
     +++
     ++NOTE: This compression can be recursive. In order to
     ++XOR this entry with a previous one, the previous entry needs
     ++to be decompressed first, and so on.
     +++
     ++The hard-limit for this offset is 160 (an entry can only be
     ++xor'ed against one of the 160 entries preceding it). This
     ++number is always positive, and hence entries are always xor'ed
     ++with **previous** bitmaps, not bitmaps that will come afterwards
     ++in the index.
     ++
     ++ ** {empty}
     ++ 1-byte flags for this bitmap: ::
       At the moment the only available flag is `0x1`, which hints
       that this bitmap can be re-used when rebuilding bitmap indexes
       for the repository.
 3:  2171d31fb2b !
 3:  b971558e1cb bitmap-format.txt: add information for trailing checksum
     @@ Commit message
          Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@xxxxxxxxx>

       ## Documentation/technical/bitmap-format.txt ##
     -@@ Documentation/technical/bitmap-format.txt: MIDXs, both the bit-cache and rev-cache extensions are required.
     +@@ Documentation/technical/bitmap-format.txt: in the index.
       ** The compressed bitmap itself, see Appendix A.

     -+ * TRAILER:
     -+
     -+ Index checksum of the above contents. It is a 20-byte SHA1 checksum.
     ++ * {empty}
     ++ TRAILER: ::
     ++ Trailing checksum of the preceding contents.
     +
       == Appendix A: Serialization format for an EWAH bitmap

-- 
gitgitgadget