Re: [PATCH 2/2 v7] pack-objects: use reachability bitmap index when generating non-stdout pack

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Aug 09, 2016 at 10:32:17PM +0300, Kirill Smelkov wrote:

> Subject: Re: [PATCH 2/2 v7] pack-objects: use reachability bitmap index when
>    generating non-stdout pack

This is v7, but as I understand your numbering, it goes with v5 of patch
1/2 that I just reviewed (usually we just increment the version number
on the whole series and treat it as a unit, even if some patches didn't
change from version to version).

> So we can teach pack-objects to use bitmap index for initial object
> counting phase when generating resultant pack file too:
> 
> - if we care it is not activated under git-repack:

Do you mean "if we take care that it is not..." here?

(I think you might just be getting tripped up in the English idioms;
"care" means that we have a preference; "to take care" means that we are
being careful).

> - if we know bitmap index generation is not enabled for resultant pack:
> 
>   Current code has singleton bitmap_git so cannot work simultaneously
>   with two bitmap indices.

Minor English fixes:

  The current code has a singleton bitmap_git, so it cannot work
  simultaneously with two bitmap indices.

> - if we keep pack reuse enabled still only for "send-to-stdout" case:
> 
>   Because on pack reuse raw entries are directly written out to destination
>   pack by write_reused_pack() bypassing needed for pack index generation
>   bookkeeping done by regular codepath in write_one() and friends.

Ditto on English:

  On pack reuse raw entries are directly written out to the destination
  pack by write_reused_pack(), bypassing the need for pack index
  generation bookkeeping done by the regular code path in write_one()
  and friends.

I think this is missing the implication. Why wouldn't we want to reuse
in this case? Certainly we don't when doing a "careful" on-disk repack.
I suspect the answer is that we cannot write a ".idx" off of the result
of write_reused_pack(), and write-to-disk always includes the .idx.

> More context:
> 
>     http://marc.info/?t=146792101400001&r=1&w=2

Can we turn this into a link to public-inbox? We have just been bit by
all of our old links to gmane dying, and they cannot easily be replaced
because they use a gmane-specific article number. public-inbox URLs use
message-ids, which should be usable for other archives if public-inbox
goes away.

> diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
> index b1007f2..c92d7fc 100644
> --- a/builtin/pack-objects.c
> +++ b/builtin/pack-objects.c

The code here looks fine.

> diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
> index a278d30..9602e9a 100755
> --- a/t/t5310-pack-bitmaps.sh
> +++ b/t/t5310-pack-bitmaps.sh
> @@ -196,6 +196,18 @@ test_expect_success 'pack-objects respects --local (non-local bitmapped pack)' '
>  	! has_any packbitmap.objects 3b.objects
>  '
>  
> +test_expect_success 'pack-objects to file can use bitmap' '
> +	# make sure we still have 1 bitmap index from previous tests
> +	ls .git/objects/pack/ | grep bitmap >output &&
> +	test_line_count = 1 output &&
> +	# verify equivalent packs are generated with/without using bitmap index
> +	packasha1=$(git pack-objects --no-use-bitmap-index --all packa </dev/null) &&
> +	packbsha1=$(git pack-objects --use-bitmap-index --all packb </dev/null) &&
> +	list_packed_objects <packa-$packasha1.idx >packa.objects &&
> +	list_packed_objects <packb-$packbsha1.idx >packb.objects &&
> +	test_cmp packa.objects packb.objects
> +'

Of course we can't know if bitmaps were actually used, or if they were
turned off under the hood. But at least this exercises the code a bit.

You could possibly add a perf test which shows off the improvement, but
I don't think it's strictly necessary.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]