On 2/7/2020 10:09 AM, Garima Singh wrote: > > On 2/7/2020 8:52 AM, SZEDER Gábor wrote: >>> * Added unit tests for the bloom filter computation layer >> >> This fails on big endian, e.g. in Travis CI's s390x build: >> >> https://travis-ci.org/szeder/git-cooking-topics-for-travis-ci/jobs/647253022#L2210 >> >> (The link highlights the failure, but I'm afraid your browser won't >> jump there right away; you'll have to click on the print-test-failures >> fold at the bottom, and scroll down a bit...) >> > > Thank you so much for running this pipeline and pointing out the error! > > We will carefully review our interactions with the binary data and > hopefully solve this in the next version. Szeder, Thanks so much for running this test. We don't have access to a big endian machine right now, so could you please apply this patch and re-run your tests? The issue is described in the message below, and Garima is working to ensure the handling of the filter data is clarified in the next version. This is an issue from WAY back in the original prototype, and it highlights that we've never been writing the data in network-byte order. This is completely my fault. Thanks, -Stolee -->8-- >From c1067db5d618b2dae430dfe373a11c771517da9e Mon Sep 17 00:00:00 2001 From: Derrick Stolee <dstolee@xxxxxxxxxxxxx> Date: Fri, 7 Feb 2020 10:24:05 -0500 Subject: [PATCH] fixup! bloom: core Bloom filter implementation for changed paths The 'data' field of 'struct bloom_filter' can point to a memory location (when computing one before writing to the commit-graph) or a memmap()'d file location (when reading from the Bloom data chunk of the commit-graph file). This means that the memory representation may be backwards in Little Endian or Big Endian machines. Always write and read bits from 'filter->data' using network order. This allows us to avoid loading the data streams from the file into memory buffers. Signed-off-by: Derrick Stolee <dstolee@xxxxxxxxxxxxx> --- bloom.c | 6 ++++-- t/helper/test-bloom.c | 2 +- 2 files changed, 5 insertions(+), 3 deletions(-) diff --git a/bloom.c b/bloom.c index 90d84dc713..aa6896584b 100644 --- a/bloom.c +++ b/bloom.c @@ -124,8 +124,9 @@ void add_key_to_filter(struct bloom_key *key, for (i = 0; i < settings->num_hashes; i++) { uint64_t hash_mod = key->hashes[i] % mod; uint64_t block_pos = hash_mod / BITS_PER_WORD; + uint64_t bit = get_bitmask(hash_mod); - filter->data[block_pos] |= get_bitmask(hash_mod); + filter->data[block_pos] |= htonll(bit); } } @@ -269,7 +270,8 @@ int bloom_filter_contains(struct bloom_filter *filter, for (i = 0; i < settings->num_hashes; i++) { uint64_t hash_mod = key->hashes[i] % mod; uint64_t block_pos = hash_mod / BITS_PER_WORD; - if (!(filter->data[block_pos] & get_bitmask(hash_mod))) + uint64_t bit = get_bitmask(hash_mod); + if (!(filter->data[block_pos] & htonll(bit))) return 0; } diff --git a/t/helper/test-bloom.c b/t/helper/test-bloom.c index 9b4be97f75..09b2bb0a00 100644 --- a/t/helper/test-bloom.c +++ b/t/helper/test-bloom.c @@ -23,7 +23,7 @@ static void print_bloom_filter(struct bloom_filter *filter) { printf("Filter_Length:%d\n", filter->len); printf("Filter_Data:"); for (i = 0; i < filter->len; i++){ - printf("%"PRIx64"|", filter->data[i]); + printf("%"PRIx64"|", ntohll(filter->data[i])); } printf("\n"); } -- 2.25.0.vfs.1.1.1.g9906319d24.dirty