Re: [PATCH v2 06/23] pack-bitmap-write: support storing pseudo-merge commits

On Mon, May 13, 2024 at 02:42:46PM -0400, Jeff King wrote:
> On Mon, Apr 29, 2024 at 04:43:15PM -0400, Taylor Blau wrote:
>
> > diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
> > index 9bc41a9e145..fef02cd745a 100644
> > --- a/pack-bitmap-write.c
> > +++ b/pack-bitmap-write.c
> > @@ -24,7 +24,7 @@ struct bitmapped_commit {
> >  	struct ewah_bitmap *write_as;
> >  	int flags;
> >  	int xor_offset;
> > -	uint32_t commit_pos;
> > +	unsigned pseudo_merge : 1;
> >  };
>
> The addition of the bit flag here makes sense, but dropping commit_pos
> caught me by surprise. But...it looks like that flag is simply unused
> cruft even before this patch?
>
> It might be worth noting that in the commit message, or better still,
> pulling its removal out to a preparatory patch.

Hah, so this is a funny one :-).

I was following your suggestion to pull out the deletion into its own
patch[^1] and starting to dig out back-references to indicate why it was
safe to remove this field.

But the only reference to commit_pos is from 7cc8f971085 (pack-objects:
implement bitmap writing, 2013-12-21), which is the commit that added
this field in the first place. Looking at:

    $ git log -p -S commit_pos 7cc8f971085 -- pack-bitmap-write.c

doesn't really show us anything interesting, either.

But! There is an array called commit_positions, which I suspected was
for holding the values of commit_pos in the same order as they appear in
the writer.selected array.
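
To make that relationship concrete, here is a minimal sketch (not code
from git). It assumes an auxiliary array whose i'th entry describes the
i'th selected commit, and shows that copying each value into a
per-struct field loses nothing. The names `demo_commit` and
`demo_fold_positions` are made up for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct bitmapped_commit. */
struct demo_commit {
	uint32_t commit_pos; /* per-entry slot, like the unused field */
};

/*
 * Copy commit_positions[i] into selected[i].commit_pos for every i.
 * Afterwards the auxiliary array is redundant: each entry carries its
 * own position.
 */
void demo_fold_positions(struct demo_commit *selected,
			 const uint32_t *commit_positions, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		selected[i].commit_pos = commit_positions[i];
}
```

Any reader of `commit_positions[i]` can then read
`selected[i].commit_pos` instead.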

So I think the right patch is something like this (which I'll put in the
next round of this series):

--- 8< ---
Subject: [PATCH] pack-bitmap-write.c: move commit_positions into commit_pos
 fields

In 7cc8f971085 (pack-objects: implement bitmap writing, 2013-12-21), the
bitmapped_commit struct was introduced, including the 'commit_pos'
field, which has been unused ever since its introduction more than a
decade ago.

Instead, we have used the nearby `commit_positions` array, leaving the
bitmapped_commit struct with an unused 4-byte field.

We could drop the `commit_pos` field as unused, and continue to store
the values in the auxiliary array. But we could also drop the array and
store the data for each bitmapped_commit struct inside of the structure
itself, which is what this patch does.

In any spot that we previously read `commit_positions[i]`, we can now
instead read `writer.selected[i].commit_pos`. There are a few spots that
need changing as a result:

  - write_selected_commits_v1() is a simple transformation, since we're
    just reading the field. As a result, the function no longer needs an
    explicit argument to pass the commit_positions array.

  - write_lookup_table() also no longer needs the explicit
    commit_positions array passed in as an argument. But it still needs
    to sort an array of indices into the writer.selected array to read
    them in commit_pos order, so table_cmp() is adjusted accordingly.

  - bitmap_writer_finish() no longer needs to allocate, populate, and
    free the commit_positions table. Instead, we can just write the data
    directly into each struct bitmapped_commit.

Signed-off-by: Taylor Blau <me@xxxxxxxxxxxx>
---
 pack-bitmap-write.c | 42 ++++++++++++++++--------------------------
 1 file changed, 16 insertions(+), 26 deletions(-)

diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 473a0fa0d40..26f57e48804 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -679,9 +679,7 @@ static const struct object_id *oid_access(size_t pos, const void *table)
 	return &index[pos]->oid;
 }

-static void write_selected_commits_v1(struct hashfile *f,
-				      uint32_t *commit_positions,
-				      off_t *offsets)
+static void write_selected_commits_v1(struct hashfile *f, off_t *offsets)
 {
 	int i;

@@ -691,7 +689,7 @@ static void write_selected_commits_v1(struct hashfile *f,
 		if (offsets)
 			offsets[i] = hashfile_total(f);

-		hashwrite_be32(f, commit_positions[i]);
+		hashwrite_be32(f, stored->commit_pos);
 		hashwrite_u8(f, stored->xor_offset);
 		hashwrite_u8(f, stored->flags);

@@ -699,23 +697,20 @@ static void write_selected_commits_v1(struct hashfile *f,
 	}
 }

-static int table_cmp(const void *_va, const void *_vb, void *_data)
+static int table_cmp(const void *_va, const void *_vb)
 {
-	uint32_t *commit_positions = _data;
-	uint32_t a = commit_positions[*(uint32_t *)_va];
-	uint32_t b = commit_positions[*(uint32_t *)_vb];
+	struct bitmapped_commit *a = &writer.selected[*(uint32_t *)_va];
+	struct bitmapped_commit *b = &writer.selected[*(uint32_t *)_vb];

-	if (a > b)
+	if (a->commit_pos < b->commit_pos)
+		return -1;
+	else if (a->commit_pos > b->commit_pos)
 		return 1;
-	else if (a < b)
-		return -1;

 	return 0;
 }

-static void write_lookup_table(struct hashfile *f,
-			       uint32_t *commit_positions,
-			       off_t *offsets)
+static void write_lookup_table(struct hashfile *f, off_t *offsets)
 {
 	uint32_t i;
 	uint32_t *table, *table_inv;
@@ -731,7 +726,7 @@ static void write_lookup_table(struct hashfile *f,
 	 * bitmap corresponds to j'th bitmapped commit (among the selected
 	 * commits) in lex order of OIDs.
 	 */
-	QSORT_S(table, writer.selected_nr, table_cmp, commit_positions);
+	QSORT(table, writer.selected_nr, table_cmp);

 	/* table_inv helps us discover that relationship (i'th bitmap
 	 * to j'th commit by j = table_inv[i])
@@ -762,7 +757,7 @@ static void write_lookup_table(struct hashfile *f,
 			xor_row = 0xffffffff;
 		}

-		hashwrite_be32(f, commit_positions[table[i]]);
+		hashwrite_be32(f, writer.selected[table[i]].commit_pos);
 		hashwrite_be64(f, (uint64_t)offsets[table[i]]);
 		hashwrite_be32(f, xor_row);
 	}
@@ -798,7 +793,6 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 	static uint16_t flags = BITMAP_OPT_FULL_DAG;
 	struct strbuf tmp_file = STRBUF_INIT;
 	struct hashfile *f;
-	uint32_t *commit_positions = NULL;
 	off_t *offsets = NULL;
 	uint32_t i;

@@ -823,22 +817,19 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
 		CALLOC_ARRAY(offsets, index_nr);

-	ALLOC_ARRAY(commit_positions, writer.selected_nr);
-
 	for (i = 0; i < writer.selected_nr; i++) {
 		struct bitmapped_commit *stored = &writer.selected[i];
-		int commit_pos = oid_pos(&stored->commit->object.oid, index, index_nr, oid_access);
+		stored->commit_pos = oid_pos(&stored->commit->object.oid, index,
+					     index_nr, oid_access);

-		if (commit_pos < 0)
+		if (stored->commit_pos < 0)
 			BUG(_("trying to write commit not in index"));
-
-		commit_positions[i] = commit_pos;
 	}

-	write_selected_commits_v1(f, commit_positions, offsets);
+	write_selected_commits_v1(f, offsets);

 	if (options & BITMAP_OPT_LOOKUP_TABLE)
-		write_lookup_table(f, commit_positions, offsets);
+		write_lookup_table(f, offsets);

 	if (options & BITMAP_OPT_HASH_CACHE)
 		write_hash_cache(f, index, index_nr);
@@ -853,6 +844,5 @@ void bitmap_writer_finish(struct pack_idx_entry **index,
 		die_errno("unable to rename temporary bitmap file to '%s'", filename);

 	strbuf_release(&tmp_file);
-	free(commit_positions);
 	free(offsets);
 }

--
2.45.0.57.gee4186f79f3

--- >8 ---
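
As a standalone illustration of two details in the patch above, here is
a rough sketch (not code from git; every `demo_` name is made up, and
plain qsort() stands in for git's QSORT). It sorts a table of indices by
the commit_pos of the entries they point at, which is the shape of the
new table_cmp(). It also assumes commit_pos keeps its uint32_t type, in
which case a `< 0` test on the field itself could never be true, so the
sketch checks a signed temporary before storing it.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct bitmapped_commit; one field only. */
struct demo_commit {
	uint32_t commit_pos;
};

/* File-scope pointer, mirroring how table_cmp() reaches writer.selected. */
static struct demo_commit *demo_selected;

/*
 * Store a lookup result. `pos` is signed (oid_pos() returns int), so a
 * negative "not found" value is rejected before it is narrowed into the
 * unsigned field. Returns 0 on success, -1 if pos was negative.
 */
int demo_store_pos(struct demo_commit *c, int pos)
{
	if (pos < 0)
		return -1;
	c->commit_pos = (uint32_t)pos;
	return 0;
}

/* Compare two indices into demo_selected by the commit_pos they name. */
int demo_table_cmp(const void *va, const void *vb)
{
	const struct demo_commit *a = &demo_selected[*(const uint32_t *)va];
	const struct demo_commit *b = &demo_selected[*(const uint32_t *)vb];

	if (a->commit_pos < b->commit_pos)
		return -1;
	if (a->commit_pos > b->commit_pos)
		return 1;
	return 0;
}

/* Fill table[] with 0..nr-1, then sort it into commit_pos order. */
void demo_sort_table(uint32_t *table, struct demo_commit *selected, size_t nr)
{
	size_t i;

	for (i = 0; i < nr; i++)
		table[i] = (uint32_t)i;
	demo_selected = selected;
	qsort(table, nr, sizeof(*table), demo_table_cmp);
}
```

With positions 7, 2, 5 stored in entries 0, 1, 2, the sorted table is
1, 2, 0: the entries themselves stay put and only the indices move.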

> > +static inline int bitmap_writer_selected_nr(void)
> > +{
> > +	return writer.selected_nr - writer.pseudo_merges_nr;
> > +}
>
> OK, so now most spots should use this new function instead of looking at
> writer.selected_nr directly. But if anybody accidentally uses the old
> field directly, it is presumably disastrous. Is it worth renaming it to
> make sure we caught all references?

We only need to check within this file, since the bitmap_writer
structure is defined within the pack-bitmap-write.c compilation unit.

I took a careful look through the file, and am confident that we touched
all of the spots that needed attention.
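
For readers following along, the distinction at stake can be sketched
like this (not code from git; `demo_writer` and
`demo_selected_commits_nr` are hypothetical names, and size_t is used
here where the quoted helper returns int):

```c
#include <stddef.h>

/* Hypothetical, trimmed-down writer state; field names follow the hunk. */
struct demo_writer {
	size_t selected_nr;      /* every entry, pseudo-merges included */
	size_t pseudo_merges_nr; /* pseudo-merge entries only */
};

/*
 * Number of ordinary selected commits. A caller that reads selected_nr
 * directly would also count the pseudo-merge entries.
 */
size_t demo_selected_commits_nr(const struct demo_writer *w)
{
	return w->selected_nr - w->pseudo_merges_nr;
}
```

A loop that should visit only ordinary commits would bound itself by the
helper's result rather than by the raw field.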

Thanks,
Taylor

[^1]: If memory serves, that was my original intention when writing this
  series for the first time, but I must have forgotten when I was
  actually splitting out the individual patches and staged the removal
  alongside the rest of this change.



