Re: [PATCH v2 00/11] Changed Paths Bloom Filters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 07, 2020 at 10:36:58AM -0500, Derrick Stolee wrote:
> On 2/7/2020 10:09 AM, Garima Singh wrote:
> > 
> > On 2/7/2020 8:52 AM, SZEDER Gábor wrote:
> >>>  * Added unit tests for the bloom filter computation layer
> >>
> >> This fails on big endian, e.g. in Travis CI's s390x build:
> >>
> >>   https://travis-ci.org/szeder/git-cooking-topics-for-travis-ci/jobs/647253022#L2210
> >>
> >> (The link highlights the failure, but I'm afraid your browser won't
> >> jump there right away; you'll have to click on the print-test-failures
> >> fold at the bottom, and scroll down a bit...)
> >>
> > 
> > Thank you so much for running this pipeline and pointing out the error!
> > 
> > We will carefully review our interactions with the binary data and 
> > hopefully solve this in the next version. 
> 
> Szeder,
> 
> Thanks so much for running this test. We don't have access to a big endian
> machine right now, so could you please apply this patch and re-run your tests?

Unfortunately, it still failed:

  https://travis-ci.org/szeder/git-cooking-topics-for-travis-ci/jobs/647395554#L2204

> The issue is described in the message below, and Garima is working to ensure
> the handling of the filter data is clarified in the next version.
> 
> This is an issue from WAY back in the original prototype, and it highlights
> that we've never been writing the data in network-byte order. This is completely
> my fault.
> 
> Thanks,
> -Stolee
> 
> 
> -->8--
> 
> From c1067db5d618b2dae430dfe373a11c771517da9e Mon Sep 17 00:00:00 2001
> From: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> Date: Fri, 7 Feb 2020 10:24:05 -0500
> Subject: [PATCH] fixup! bloom: core Bloom filter implementation for changed
>  paths
> 
> The 'data' field of 'struct bloom_filter' can point to a memory location
> (when computing one before writing to the commit-graph) or a memmap()'d
> file location (when reading from the Bloom data chunk of the commit-graph
> file). This means that the memory representation may be backwards in
> Little Endian or Big Endian machines.
> 
> Always write and read bits from 'filter->data' using network order. This
> allows us to avoid loading the data streams from the file into memory
> buffers.
> 
> Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx>
> ---
>  bloom.c               | 6 ++++--
>  t/helper/test-bloom.c | 2 +-
>  2 files changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/bloom.c b/bloom.c
> index 90d84dc713..aa6896584b 100644
> --- a/bloom.c
> +++ b/bloom.c
> @@ -124,8 +124,9 @@ void add_key_to_filter(struct bloom_key *key,
>  	for (i = 0; i < settings->num_hashes; i++) {
>  		uint64_t hash_mod = key->hashes[i] % mod;
>  		uint64_t block_pos = hash_mod / BITS_PER_WORD;
> +		uint64_t bit = get_bitmask(hash_mod);
>  
> -		filter->data[block_pos] |= get_bitmask(hash_mod);
> +		filter->data[block_pos] |= htonll(bit);
>  	}
>  }
>  
> @@ -269,7 +270,8 @@ int bloom_filter_contains(struct bloom_filter *filter,
>  	for (i = 0; i < settings->num_hashes; i++) {
>  		uint64_t hash_mod = key->hashes[i] % mod;
>  		uint64_t block_pos = hash_mod / BITS_PER_WORD;
> -		if (!(filter->data[block_pos] & get_bitmask(hash_mod)))
> +		uint64_t bit = get_bitmask(hash_mod);
> +		if (!(filter->data[block_pos] & htonll(bit)))
>  			return 0;
>  	}
>  
> diff --git a/t/helper/test-bloom.c b/t/helper/test-bloom.c
> index 9b4be97f75..09b2bb0a00 100644
> --- a/t/helper/test-bloom.c
> +++ b/t/helper/test-bloom.c
> @@ -23,7 +23,7 @@ static void print_bloom_filter(struct bloom_filter *filter) {
>  	printf("Filter_Length:%d\n", filter->len);
>  	printf("Filter_Data:");
>  	for (i = 0; i < filter->len; i++){
> -		printf("%"PRIx64"|", filter->data[i]);
> +		printf("%"PRIx64"|", ntohll(filter->data[i]));
>  	}
>  	printf("\n");
>  }
> -- 
> 2.25.0.vfs.1.1.1.g9906319d24.dirty
> 
> 
> 



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux