On Mon, Jun 27, 2022 at 10:35:25AM -0400, Derrick Stolee wrote: > On 6/26/2022 9:10 AM, Abhradeep Chakraborty via GitGitGadget wrote: > > > + uint32_t a = positions[*(uint32_t *)_va]; > > + uint32_t b = positions[*(uint32_t *)_vb]; > > + > > + if (a > b) > > + result = 1; > > + else if (a < b) > > + result = -1; > > + else > > + result = 0; > > + > > + return result; > > +} > > Ok, here you are sorting by commit OID (indirectly by the order in the > [multi-]pack-index). I suppose that I misunderstood in the previous > patch, so that could use some more specific language, maybe. Yeah, I agree that some more specific language could be used, with the main idea being there that we make it clearer that the list of tuples is still sorted (and can be binary searched). > > + for (i = 0; i < writer.selected_nr; i++) > > + table[i] = i; > > + > > + QSORT_S(table, writer.selected_nr, table_cmp, commit_positions); > > At the end of this sort, table[j] = i means that the ith bitmap corresponds > to the jth bitmapped commit in lex order of OIDs. > > > + for (i = 0; i < writer.selected_nr; i++) > > + table_inv[table[i]] = i; > > And table_inv helps us discover that relationship (ith bitmap to jth commit > by j = table_inv[i]). These are both great descriptions and should give an idea of what sort of information is worth putting into a comment. > > > + for (i = 0; i < writer.selected_nr; i++) { > > + struct bitmapped_commit *selected = &writer.selected[table[i]]; > > + uint32_t xor_offset = selected->xor_offset; > > Here, xor_offset is "number of bitmaps in relationship to the current bitmap" It's an offset to an earlier commit which must be used to XOR-decompress the current one (if any). > > + hashwrite_be32(f, commit_positions[table[i]]); > > + hashwrite_be64(f, offsets[table[i]]); > > + hashwrite_be32(f, xor_offset ? > > + table_inv[table[i] - xor_offset]: 0xffffffff); > > Which means that if "k = table[i] - xor_offset" that the xor base is the kth > bitmap. table_inv[k] gets us the position in this table of that bitmap's > commit. Yes, exactly. Abhradeep: this is also worth commenting ;-). > (It's also strange to me that the offset is being _subtracted_, but I guess > the bitmap format requires the xor base to appear first so the offset does > not need to be a negative number ever.) You're right, this follows from the fact that the XOR bases must come before the commits who must use them to decompress themselves. From Documentation/technical/bitmap-format.txt: This number is always positive, and hence entries are always xor'ed with **previous** bitmaps, not bitmaps that will come afterwards in the index. > This last line is a bit complex. > > uint32_t xor_offset = selected->xor_offset; > uint32_t xor_row = 0xffffffff; > > if (xor_offset) { > uint32_t xor_order = table[i] - xor_offset; > xor_row = table_inf[xor_order]; > } > > ...then we can "hashwrite_be32(f, xor_row);" when necessary. I'm not sure > that we need the "uint32_t xor_order" inside the "if (xor_offset)" block, > but splitting it helps add clarity to the multi-step computation. I had the same thought, though I would also say that xor_row should be declared, not initialized, and the "else" block of "if (xor_offset)" should set it to 0xffffffff to make the relationship between xor_offset and the value written a little clearer. Thanks, Taylor