Re: [PATCH 06/12] bloom: use num_changes not nr for limit detection

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, May 01, 2020 at 03:30:23PM +0000, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
>
> As diff_tree_oid() computes a diff, it will terminate early if the
> total number of changed paths is strictly larger than max_changes.
> This includes the directories that changed, not just the file paths.
> However, only the file paths are reflected in the resulting diff
> queue "nr" value.

Nit; s/queue/queue's

>
> Use the "num_changes" from diffopt to check if the diff terminated
> early. This is incredibly important, as it can result in incorrect
> filters! For example, the first commit in the Linux kernel repo
> reports only 471 changes, but since these are nested inside several
> directories they expand to 513 "real" changes, and in fact the
> total list of changes is not reported. Thus, the computed filter
> for this commit is incorrect.

Wow, this is a great find. Nicely done!

> Demonstrate the subtle difference by using one fewer file change
> in the 'get bloom filter for commit with 513 changes' test. Before,
> this edited 513 files inside "bigDir" which hit this inequality.
> However, dropping the file count by one demonstrates how the
> previous inequality was incorrect but the new one is correct.
>
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> ---
>  bloom.c          | 2 +-
>  t/t0095-bloom.sh | 2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/bloom.c b/bloom.c
> index c3b81b1a38a..9a8386ac676 100644
> --- a/bloom.c
> +++ b/bloom.c
> @@ -215,7 +215,7 @@ struct bloom_filter *get_bloom_filter(struct repository *r,
>  		diff_tree_oid(NULL, &c->object.oid, "", &diffopt);
>  	diffcore_std(&diffopt);
>
> -	if (diff_queued_diff.nr <= max_changes) {
> +	if (diffopt.num_changes <= max_changes) {
>  		struct hashmap pathmap;
>  		struct pathmap_hash_entry *e;
>  		struct hashmap_iter iter;
> diff --git a/t/t0095-bloom.sh b/t/t0095-bloom.sh
> index 6defeb544f1..48a90625596 100755
> --- a/t/t0095-bloom.sh
> +++ b/t/t0095-bloom.sh
> @@ -100,7 +100,7 @@ test_expect_success EXPENSIVE 'get bloom filter for commit with 513 changes' '
>  	rm actual &&
>  	rm expect &&
>  	mkdir bigDir &&
> -	for i in $(test_seq 0 512)
> +	for i in $(test_seq 0 511)

Thanks for demonstrating the fix. This is change is very subtle, but I
think that it's right on.

>  	do
>  		echo $i >bigDir/$i
>  	done &&
> --
> gitgitgadget

  Reviewed-by: Taylor Blau <me@xxxxxxxxxxxx>

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux