Re: [PATCH 12/12] builtin/commit-graph.c: introduce '--max-new-filters=<n>'

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/14/2020 4:12 PM, Taylor Blau wrote:
> On Fri, Sep 11, 2020 at 03:25:55PM -0400, Taylor Blau wrote:
>> On Fri, Sep 11, 2020 at 02:59:34PM -0400, Taylor Blau wrote:
>>> On Fri, Sep 11, 2020 at 01:52:16PM -0400, Jeff King wrote:
>>>> On Wed, Sep 09, 2020 at 11:24:00AM -0400, Taylor Blau wrote:
>>>>> +With the `--max-new-filters=<n>` option, generate at most `n` new Bloom
>>>>> +filters (if `--changed-paths` is specified). If `n` is `-1`, no limit is
>>>>> +enforced. Commits whose filters are not calculated are stored as a
>>>>> +length zero Bloom filter, and their bit is marked in the `BFXL` chunk.
>>>>> +Overrides the `commitGraph.maxNewFilters` configuration.
>>>>
>>>> The BFXL chunk doesn't exist anymore in this iteration, right?
>>>
>>> Ack; I'll have to drop that.
>>
>> Junio, I know that I've already sent one replacement patch. If you don't
>> mind, here's another (and if you do mind, I'm happy to re-roll the
>> series).
> 
> Just kidding. Let's use *this* version which fixes a bug reading the
> commitGraph.maxNewFilters configuration. At this point, the
> fix-ups are:
> 
>   - This patch (attached below the scisors) instead of 12/12, and
> 
>   - This [1] patch instead of 10/12.
> 
> [1]: https://lore.kernel.org/git/20200910154516.GA32117@nand.local/
> 
> Let me know if you'd rather have a full re-roll.

It's getting a bit difficult to track all of these "use this instead"
patches. But, I'm not the one applying them, so maybe that's not actually
a problem.

You might need a re-roll, anyway, as I have a few comments here:

> --- 8< ---
> 
> Subject: [PATCH] builtin/commit-graph.c: introduce '--max-new-filters=<n>'

You also introduce commitGraph.maxNewFitlers here, which is not
mentioned in the commit message anywhere. In fact, it might be
good to include it as a separate patch so its implementation and
tests can be isolated from the command-line functionality.

> +length zero Bloom filter. Overrides the `commitGraph.maxNewFilters`
> +configuration.

We have found it valuable to demonstrate these overrides in tests.
Let's inspect your tests for this.

> +test_bloom_filters_computed () {
> +	commit_graph_args=$1
> +	rm -f "$TRASH_DIRECTORY/trace.event" &&
> +	GIT_TRACE2_EVENT="$TRASH_DIRECTORY/trace.event" git commit-graph write \
> +		$commit_graph_args &&
> +	grep "\"filter_not_computed\":$2" "$TRASH_DIRECTORY/trace.event" &&
> +	grep "\"filter_trunc_large\":$3" "$TRASH_DIRECTORY/trace.event" &&
> +	grep "\"filter_computed\":$4" "$TRASH_DIRECTORY/trace.event"
> +}

If the arguments were moved to the last parameter, then we could do a few
interesting things here.

test_bloom_filters_computed () {
	NOT_COMPUTED="\"filter_not_computed\":$1" &&
	shift &&
	TRUNCATED="\"filter_trunc_large\":$1" &&
	shift &&
	COMPUTED="\"filter_computed\":$1" &&
	shift &&
	rm -f "$TRASH_DIRECTORY/trace.event" &&
	GIT_TRACE2_EVENT="$TRASH_DIRECTORY/trace.event" git commit-graph write $@ &&
	grep "$NOT_COMPUTED" "$TRASH_DIRECTORY/trace.event" &&
	grep "$TRUNCATED" "$TRASH_DIRECTORY/trace.event" &&
	grep "$COMPUTED" "$TRASH_DIRECTORY/trace.event"
}


(I have not tested this script. It might need some work.)
This would make your callers a bit cleaner-looking, for example:

test_expect_success 'Bloom generation is limited by --max-new-filters' '
	(
		cd limits &&
		test_commit c2 filter &&
		test_commit c3 filter &&
		test_commit c4 no-filter &&
		test_bloom_filters_computed 3 0 2 \
			--reachable --changed-paths --split=replace --max-new-filters=2
	)
'

At least, this looks nicer to me.

> +test_expect_success 'Bloom generation backfills previously-skipped filters' '
> +	# Check specifying commitGraph.maxNewFilters over "git config" works.
> +	test_config -C limits commitGraph.maxNewFilters 1 &&
> +	(
> +		cd limits &&
> +		test_bloom_filters_computed "--reachable --changed-paths --split=replace" \
> +			4 0 1
> +	)
> +'

Adding a case for `commitGraph.maxNewFilters=1` and `--max-new-filters=2` might
be interesting for the override rules.

> +
> +test_expect_success 'Bloom generation backfills empty commits' '
> +	git init empty &&
> +	test_when_finished "rm -fr empty" &&
> +	(
> +		cd empty &&
> +		for i in $(test_seq 1 6)
> +		do
> +			git commit --allow-empty -m "$i"
> +		done &&
> +
> +		# Generate Bloom filters for empty commits 1-6, two at a time.
> +		test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> +			4 0 2 &&
> +		test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> +			4 0 2 &&
> +		test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> +			4 0 2 &&

I'm concerned that the max-new-filters limit (2) is a divisor
of the full number of commits (6). It might be good to add one
more commit here and test again with a limit of 2. That would
handle both "equal to limit" and "less than limit" cases.

> +		# Finally, make sure that once all commits have filters, that
> +		# none are subsequently recomputed.
> +		test_bloom_filters_computed "--reachable --changed-paths --max-new-filters=2" \
> +			6 0 0
> +	)
> +'

Thanks,
-Stolee




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux