On Mon, May 13, 2024 at 02:42:46PM -0400, Jeff King wrote:
> On Mon, Apr 29, 2024 at 04:43:15PM -0400, Taylor Blau wrote:
>
> > diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> > index 9bc41a9e145..fef02cd745a 100644
> > --- a/pack-bitmap-write.c
> > +++ b/pack-bitmap-write.c
> > @@ -24,7 +24,7 @@ struct bitmapped_commit {
> >  	struct ewah_bitmap *write_as;
> >  	int flags;
> >  	int xor_offset;
> > -	uint32_t commit_pos;
> > +	unsigned pseudo_merge : 1;
> >  };
>
> The addition of the bit flag here makes sense, but dropping commit_pos
> caught me by surprise. But...it looks like that flag is simply unused
> cruft even before this patch?
>
> It might be worth noting that in the commit message, or better still,
> pulling its removal out to a preparatory patch.

Hah, so this is a funny one :-).

I was following your suggestion to pull out the deletion into its own
patch[^1] and starting to dig out back-references to indicate why it was
safe to remove this field.

But the only reference to commit_pos is from 7cc8f971085 (pack-objects:
implement bitmap writing, 2013-12-21), which is the commit that added
this field in the first place. Looking at:

	$ git log -p -S commit_pos 7cc8f971085 -- pack-bitmap-write.c

doesn't really show us anything interesting, either.

But! There is an array called commit_positions, which I suspected was for
holding the values of commit_pos in the same order as they appear in the
writer.selected array.

So I think the right patch is something like this (which I'll put in the
next round of this series):

--- 8< ---

Subject: [PATCH] pack-bitmap-write.c: move commit_positions into commit_pos fields

In 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21), the
bitmapped_commit struct was introduced, including the 'commit_pos'
field, which has been unused ever since its introduction more than a
decade ago.

Instead, we have used the nearby `commit_positions` array leaving the
bitmapped_commit struct with an unused 4-byte field.

We could drop the `commit_pos` field as unused, and continue to store
the values in the auxiliary array. But we could also drop the array and
store the data for each bitmapped_commit struct inside of the structure
itself, which is what this patch does.

In any spot that we previously read `commit_positions[i]`, we can now
instead read `writer.selected[i].commit_pos`. There are a few spots that
need changing as a result:

  - write_selected_commits_v1() is a simple transformation, since we're
    just reading the field. As a result, the function no longer needs an
    explicit argument to pass the commit_positions array.

  - write_lookup_table() also no longer needs the explicit
    commit_positions array passed in as an argument. But it still needs
    to sort an array of indices into the writer.selected array to read
    them in commit_pos order, so table_cmp() is adjusted accordingly.

  - bitmap_writer_finish() no longer needs to allocate, populate, and
    free the commit_positions table. Instead, we can just write the data
    directly into each struct bitmapped_commit.

Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
---
 pack-bitmap-write.c | 42 ++++++++++++++++--------------------------
 1 file changed, 16 insertions(+), 26 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 473a0fa0d40..26f57e48804 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -679,9 +679,7 @@ static const struct object_id *oid_access(size_t pos, const void *table)
 	return &index[pos]->oid;
 }
 
-static void write_selected_commits_v1(struct hashfile *f,
-				      uint32_t *commit_positions,
-				      off_t *offsets)
+static void write_selected_commits_v1(struct hashfile *f, off_t *offsets)
 {
 	int i;
 
@@ -691,7 +689,7 @@ static void write_selected_commits_v1(struct hashfile *f,
 		if (offsets)
 			offsets[i] = hashfile_total(f);
 
-		hashwrite_be32(f, commit_positions[i]);
+		hashwrite_be32(f, stored->commit_pos);
 		hashwrite_u8(f, stored->xor_offset);
 		hashwrite_u8(f, stored->flags);
 
@@ -699,23 +697,20 @@ static void write_selected_commits_v1(struct hashfile *f,
 	}
 }
 
-static int table_cmp(const void *_va, const void *_vb, void *_data)
+static int table_cmp(const void *_va, const void *_vb)
 {
-	uint32_t *commit_positions = _data;
-	uint32_t a = commit_positions[*(uint32_t *)_va];
-	uint32_t b = commit_positions[*(uint32_t *)_vb];
+	struct bitmapped_commit *a = &writer.selected[*(uint32_t *)_va];
+	struct bitmapped_commit *b = &writer.selected[*(uint32_t *)_vb];
 
-	if (a > b)
+	if (a->commit_pos < b->commit_pos)
+		return -1;
+	else if (a->commit_pos > b->commit_pos)
 		return 1;
-	else if (a < b)
-		return -1;
 
 	return 0;
 }
 
-static void write_lookup_table(struct hashfile *f,
-			       uint32_t *commit_positions,
-			       off_t *offsets)
+static void write_lookup_table(struct hashfile *f, off_t *offsets)
 {
 	uint32_t i;
 	uint32_t *table, *table_inv;
@@ -731,7 +726,7 @@ static void write_lookup_table(struct hashfile *f,
 	 * bitmap corresponds to j'th bitmapped commit (among the selected
 	 * commits) in lex order of OIDs.
 	 */
-	QSORT_S(table, writer.selected_nr, table_cmp, commit_positions);
+	QSORT(table, writer.selected_nr, table_cmp);
 
 	/* table_inv helps us discover that relationship (i'th bitmap
 	 * to j'th commit by j = table_inv[i])
@@ -762,7 +757,7 @@ static void write_lookup_table(struct hashfile *f,
 			xor_row = 0xffffffff;
 		}
 
-		hashwrite_be32(f, commit_positions[table[i]]);
+		hashwrite_be32(f, writer.selected[table[i]].commit_pos);
 		hashwrite_be64(f, (uint64_t)offsets[table[i]]);
 		hashwrite_be32(f, xor_row);
 	}
@@ -798,7 +793,6 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 	static uint16_t flags = BITMAP_OPT_FULL_DAG;
 	struct strbuf tmp_file = STRBUF_INIT;
 	struct hashfile *f;
-	uint32_t *commit_positions = NULL;
 	off_t *offsets = NULL;
 	uint32_t i;
 
@@ -823,22 +817,19 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
 		CALLOC_ARRAY(offsets, index_nr);
 
-	ALLOC_ARRAY(commit_positions, writer.selected_nr);
-
 	for (i = 0; i < writer.selected_nr; i++) {
 		struct bitmapped_commit *stored = &writer.selected[i];
-		int commit_pos = oid_pos(&stored->commit->object.oid, index, index_nr, oid_access);
+		stored->commit_pos = oid_pos(&stored->commit->object.oid, index,
+					     index_nr, oid_access);
 
-		if (commit_pos < 0)
+		if (stored->commit_pos < 0)
 			BUG(_("trying to write commit not in index"));
-
-		commit_positions[i] = commit_pos;
 	}
 
-	write_selected_commits_v1(f, commit_positions, offsets);
+	write_selected_commits_v1(f, offsets);
 
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
-		write_lookup_table(f, commit_positions, offsets);
+		write_lookup_table(f, offsets);
 
 	if (options & BITMAP_OPT_HASH_CACHE)
 		write_hash_cache(f, index, index_nr);
@@ -853,6 +844,5 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 		die_errno("unable to rename temporary bitmap file to '%s'", filename);
 
 	strbuf_release(&tmp_file);
-	free(commit_positions);
 	free(offsets);
 }
-- 
2.45.0.57.gee4186f79f3
--- >8 ---

> > +static inline int bitmap_writer_selected_nr(void)
> > +{
> > +	return writer.selected_nr - writer.pseudo_merges_nr;
> > +}
>
> OK, so now most spots should use this new function instead of looking at
> writer.selected_nr directly. But if anybody accidentally uses the old
> field directly, it is presumably disastrous. Is it worth renaming it to
> make sure we caught all references?

We only need to check within this file, since the bitmap_writer
structure is defined within the pack-bitmap-write.c compilation unit. I
took a careful look through the file, and am confident that we touched
all of the spots that needed attention.

Thanks,
Taylor

[^1]: If memory serves, that was my original intention when writing this
  series for the first time, but I must have forgotten when I was
  actually splitting out the individual patches and staged the removal
  alongside the rest of this change.
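
As a standalone illustration of the approach in the commit message above
(sorting an array of indices by a field stored in each struct, rather
than by a parallel array), here is a minimal sketch in plain C. It is not
the code from pack-bitmap-write.c: `demo_commit`, `demo_selected`,
`demo_set_commit_pos()` and `demo_build_tables()` are hypothetical names,
and standard `qsort()` stands in for Git's `QSORT()` macro. The sketch
also keeps the lookup result in a signed `int` until after the negative
check, on the assumption that `commit_pos` stays a `uint32_t` as in the
quoted hunk, since an unsigned field can never compare less than zero.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bitmapped_commit: only the sort key. */
struct demo_commit {
	uint32_t commit_pos;
};

/* Hypothetical stand-ins for writer.selected and writer.selected_nr. */
static struct demo_commit *demo_selected;
static uint32_t demo_selected_nr;

/*
 * Compare two indices into demo_selected by the commit_pos stored in
 * each entry. No extra "data" pointer is needed because the array is
 * reachable at file scope, so plain qsort() is enough.
 */
static int demo_table_cmp(const void *_va, const void *_vb)
{
	const struct demo_commit *a = &demo_selected[*(const uint32_t *)_va];
	const struct demo_commit *b = &demo_selected[*(const uint32_t *)_vb];

	if (a->commit_pos < b->commit_pos)
		return -1;
	else if (a->commit_pos > b->commit_pos)
		return 1;
	return 0;
}

/*
 * Store a lookup result in the unsigned field. The value stays signed
 * until after the "not found" check; testing the uint32_t field for
 * "< 0" afterwards would never fire.
 * Returns 0 on success, -1 (leaving c untouched) if pos is negative.
 */
static int demo_set_commit_pos(struct demo_commit *c, int pos)
{
	if (pos < 0)
		return -1;
	c->commit_pos = (uint32_t)pos;
	return 0;
}

/*
 * Fill table[] with the indices 0..nr-1 ordered by commit_pos, and
 * table_inv[] with the inverse permutation (table_inv[table[i]] == i).
 */
static void demo_build_tables(uint32_t *table, uint32_t *table_inv)
{
	uint32_t i;

	assert(table && table_inv);

	for (i = 0; i < demo_selected_nr; i++)
		table[i] = i;

	qsort(table, demo_selected_nr, sizeof(*table), demo_table_cmp);

	for (i = 0; i < demo_selected_nr; i++)
		table_inv[table[i]] = i;
}
```

With commit_pos values 30, 10, 40, 20 for entries 0 through 3, the
sorted index table comes out as 1, 3, 0, 2 and its inverse as 2, 0, 3, 1.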