Am 01.10.2018 um 22:26 schrieb Jeff King: > On Mon, Oct 01, 2018 at 09:15:53PM +0200, René Scharfe wrote: > The reason hashmap.c was added was to avoid open addressing. ;) Because efficient removal of elements is easier to implement with chaining, according to 6a364ced49 (add a hashtable implementation that supports O(1) removal). khash.h deletes using its flags bitmap. We didn't compare their performance when entries are removed so far. > So yeah, I think it could perhaps be improved, but in my mind talking > about "hashmap.c" is fundamentally talking about chained buckets. Admittedly I wouldn't touch hashmap.c, as I find its interface too complex to wrap my head around. But perhaps I just didn't try hard enough, yet. >> But I like how khash.h is both already in the tree and also really easy >> to deploy, as it's just a single header file. It's a tasty low-hanging >> fruit. > > Yeah. And if it really does perform better, I think we should stick with > it in the code base. I wonder if we could stand to clean up the > interfaces a little. E.g., I had a hard time declaring a hash in one > place, and then defining it somewhere else. You can't use KHASH_DECLARE and KHASH_INIT together, as both declare the same structs. So I guess the idea is to have a header file with KHASH_DECLARE and a .c file with KHASH_INIT, the latter *not* including the former, but both including khash.h. I didn't actually try that, though. > And I think as you found > that it insists on heap-allocating the hash-table struct itself, which > does not match our usual style. Perhaps we can fix that with little effort (see below). >> This is straight-forward, except for oidset_clear(), which needs to >> allocate a kh_oid_t on the heap in order to be able to feed it to >> kh_destroy_oid() for release it. Alternatively we could open-code the >> relevant parts of the latter, but that would be a layering violation. > > This is kind of a layering violation, too. You're assuming that struct > assignment is sufficient to make one kh struct freeable from another > pointer. That's probably reasonable, since you're just destroying them > both (e.g., some of our FLEX structs point into their own struct memory, > making a hidden dependency; but they obviously would not need to free > such a field). Fair enough. How about this on top? (The khash.h part would go in first in a separate patch in a proper series.) NB: I stuck to the 4-spaces-tabs formatting in khash.h here. --- khash.h | 9 +++++++-- oidset.c | 4 +--- 2 files changed, 8 insertions(+), 5 deletions(-) diff --git a/khash.h b/khash.h index 07b4cc2e67..d10caa0c35 100644 --- a/khash.h +++ b/khash.h @@ -82,11 +82,16 @@ static const double __ac_HASH_UPPER = 0.77; SCOPE kh_##name##_t *kh_init_##name(void) { \ return (kh_##name##_t*)xcalloc(1, sizeof(kh_##name##_t)); \ } \ + SCOPE void kh_release_##name(kh_##name##_t *h) \ + { \ + free(h->flags); \ + free((void *)h->keys); \ + free((void *)h->vals); \ + } \ SCOPE void kh_destroy_##name(kh_##name##_t *h) \ { \ if (h) { \ - free((void *)h->keys); free(h->flags); \ - free((void *)h->vals); \ + kh_release_##name(h); \ free(h); \ } \ } \ diff --git a/oidset.c b/oidset.c index d15b2b7a89..9836d427ef 100644 --- a/oidset.c +++ b/oidset.c @@ -25,8 +25,6 @@ int oidset_remove(struct oidset *set, const struct object_id *oid) void oidset_clear(struct oidset *set) { - kh_oid_t *to_free = kh_init_oid(); - *to_free = set->set; - kh_destroy_oid(to_free); + kh_release_oid(&set->set); oidset_init(set, 0); } -- 2.19.0