Re: [PATCH v4 11/13] pack-bitmap.c: keep track of each layer's type bitmaps

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Mar 17, 2025 at 10:01:13PM -0400, Jeff King wrote:
> On Fri, Mar 14, 2025 at 04:18:53PM -0400, Taylor Blau wrote:
>
> > @@ -81,6 +81,24 @@ struct bitmap_index {
> >  	struct ewah_bitmap *blobs;
> >  	struct ewah_bitmap *tags;
> >
> > +	/*
> > +	 * Type index arrays when this bitmap is associated with an
> > +	 * incremental multi-pack index chain.
> > +	 *
> > +	 * If n is the number of unique layers in the MIDX chain, then
> > +	 * commits_all[n-1] is this structs 'commits' field,
> > +	 * commits_all[n-2] is the commits field of this bitmap's
> > +	 * 'base', and so on.
> > +	 *
> > +	 * When either associated either with a non-incremental MIDX, or
> > +	 * a single packfile, these arrays each contain a single
> > +	 * element.
> > +	 */
> > +	struct ewah_bitmap **commits_all;
> > +	struct ewah_bitmap **trees_all;
> > +	struct ewah_bitmap **blobs_all;
> > +	struct ewah_bitmap **tags_all;
>
> OK, so these are valid only for the top-level of the chain? I guess
> there would not be much point in having the lower levels know about
> their incremental versions.

Right; all of the "useful" computation like counting, traversing,
filtering, etc. is all done at the top-most layer in any chain, so these
have no meaning/purpose for lower layers.

> > -static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git)
> > +static void load_all_type_bitmaps(struct bitmap_index *bitmap_git)
> > +{
> > +	struct bitmap_index *curr = bitmap_git;
> > +	size_t i = bitmap_git->base_nr;
> > +
> > +	ALLOC_ARRAY(bitmap_git->commits_all, bitmap_git->base_nr + 1);
> > +	ALLOC_ARRAY(bitmap_git->trees_all, bitmap_git->base_nr + 1);
> > +	ALLOC_ARRAY(bitmap_git->blobs_all, bitmap_git->base_nr + 1);
> > +	ALLOC_ARRAY(bitmap_git->tags_all, bitmap_git->base_nr + 1);
> > +
> > +	while (curr) {
> > +		bitmap_git->commits_all[i] = curr->commits;
> > +		bitmap_git->trees_all[i] = curr->trees;
> > +		bitmap_git->blobs_all[i] = curr->blobs;
> > +		bitmap_git->tags_all[i] = curr->tags;
> > +
> > +		curr = curr->base;
> > +		if (curr && !i)
> > +			BUG("unexpected number of bitmap layers, expected %"PRIu32,
> > +			    bitmap_git->base_nr + 1);
> > +		i -= 1;
> > +	}
> > +}
>
> It looks like we always allocate these. For the non-incremental case, I
> think you could just do:
>
>   bitmap_git->commits_all = &bitmap_git->commits;
>
> and so forth. But I doubt that micro-optimization really matters, and it
> introduces complications when you have to decide whether to free them or
> not.

The complications aren't actually too bad... I think you just have to
avoid free()-ing them when you have a non-NULL 'base' pointer. I think
it would look something like:

--- 8< ---
diff --git a/pack-bitmap.c b/pack-bitmap.c
index 2270a646f6..8a530fa7d8 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -604,6 +604,15 @@ static void load_all_type_bitmaps(struct bitmap_index *bitmap_git)
 	struct bitmap_index *curr = bitmap_git;
 	size_t i = bitmap_git->base_nr;

+	if (!bitmap_git->base) {
+		bitmap_git->commits_all = &bitmap_git->commits;
+		bitmap_git->trees_all = &bitmap_git->trees;
+		bitmap_git->blobs_all = &bitmap_git->blobs;
+		bitmap_git->tags_all = &bitmap_git->tags;
+
+		return;
+	}
+
 	ALLOC_ARRAY(bitmap_git->commits_all, bitmap_git->base_nr + 1);
 	ALLOC_ARRAY(bitmap_git->trees_all, bitmap_git->base_nr + 1);
 	ALLOC_ARRAY(bitmap_git->blobs_all, bitmap_git->base_nr + 1);
@@ -3031,10 +3040,12 @@ void free_bitmap_index(struct bitmap_index *b)
 	ewah_pool_free(b->trees);
 	ewah_pool_free(b->blobs);
 	ewah_pool_free(b->tags);
-	free(b->commits_all);
-	free(b->trees_all);
-	free(b->blobs_all);
-	free(b->tags_all);
+	if (b->base) {
+		free(b->commits_all);
+		free(b->trees_all);
+		free(b->blobs_all);
+		free(b->tags_all);
+	}
 	if (b->bitmaps) {
 		struct stored_bitmap *sb;
 		kh_foreach_value(b->bitmaps, sb, {
--- >8 ---

, but it leaves a bad taste in my mouth tying the NULL-ness of
'bitmap_git->base' to the allocation/freeing of these arrays. Maybe I'm
being paranoid, but it feels like a potential landmine.

> (And if you really cared about micro-optimizing, probably trying to
> prevent the extra pointer-chase in the first place would be a more
> productive path).

Yup.

Thanks,
Taylor




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux