On Thu, Aug 17, 2023 at 03:55:06PM -0400, Taylor Blau wrote: > > Another thing that we might want to track is whether the Bloom filter is > > a reference to an existing buffer (and thus does not need to be freed) > > or a reference to a malloc-ed buffer that we must free. But both before > > and after this patch set, a malloc-ed buffer is never overridden by a > > reference-to-existing-buffer, so we should still be fine for now. (This > > patch set does add a scenario in which a reference-to-existing buffer is > > overridden by a malloc-ed buffer, but that's the only new scenario.) > > Yeah, I think there is some opportunity for clean-up here. I'll take a > look... This ended up being pretty reasonable. I'm not sure whether I should include it here or not, since any leaks in the Bloom subsystem are definitely not new as of this series. But the patch is relatively straightforward anyway, so I think throwing it on the end would be OK: --- 8< --- diff --git a/bloom.c b/bloom.c index 24dd874e46..ff131893cd 100644 --- a/bloom.c +++ b/bloom.c @@ -59,6 +59,7 @@ int load_bloom_filter_from_graph(struct commit_graph *g, sizeof(unsigned char) * start_index + BLOOMDATA_CHUNK_HEADER_SIZE); filter->version = g->bloom_filter_settings->hash_version; + filter->to_free = NULL; return 1; } @@ -231,6 +232,18 @@ void init_bloom_filters(void) init_bloom_filter_slab(&bloom_filters); } +static void free_one_bloom_filter(struct bloom_filter *filter) +{ + if (!filter) + return; + free(filter->to_free); +} + +void deinit_bloom_filters(void) +{ + deep_clear_bloom_filter_slab(&bloom_filters, free_one_bloom_filter); +} + static int pathmap_cmp(const void *hashmap_cmp_fn_data UNUSED, const struct hashmap_entry *eptr, const struct hashmap_entry *entry_or_key, @@ -247,7 +260,7 @@ static int pathmap_cmp(const void *hashmap_cmp_fn_data UNUSED, static void init_truncated_large_filter(struct bloom_filter *filter, int version) { - filter->data = xmalloc(1); + filter->data = filter->to_free = xmalloc(1); filter->data[0] = 0xFF; filter->len = 1; filter->version = version; @@ -449,6 +462,7 @@ struct bloom_filter *get_or_compute_bloom_filter(struct repository *r, filter->len = 1; } CALLOC_ARRAY(filter->data, filter->len); + filter->to_free = filter->data; hashmap_for_each_entry(&pathmap, &iter, e, entry) { struct bloom_key key; diff --git a/bloom.h b/bloom.h index 4462fc3908..c1d74d63e6 100644 --- a/bloom.h +++ b/bloom.h @@ -56,6 +56,8 @@ struct bloom_filter { unsigned char *data; size_t len; int version; + + void *to_free; }; /* @@ -96,6 +98,7 @@ void add_key_to_filter(const struct bloom_key *key, const struct bloom_filter_settings *settings); void init_bloom_filters(void); +void deinit_bloom_filters(void); enum bloom_filter_computed { BLOOM_NOT_COMPUTED = (1 << 0), diff --git a/commit-graph.c b/commit-graph.c index 183ed90b6d..f22f2d350d 100644 --- a/commit-graph.c +++ b/commit-graph.c @@ -2532,6 +2532,9 @@ int write_commit_graph(struct object_directory *odb, res = write_commit_graph_file(ctx); + if (ctx->changed_paths) + deinit_bloom_filters(); + if (ctx->split) mark_commit_graphs(ctx); --- >8 --- Thanks, Taylor