On 03.07.21 at 12:38, Jeff King wrote:
> On Sat, Jul 03, 2021 at 12:05:46PM +0200, René Scharfe wrote:
>
>> We use our standard allocation functions and macros (xcalloc,
>> ALLOC_ARRAY, REALLOC_ARRAY) in our version of khash.h. They terminate
>> the program on error, so code that's using them doesn't have to handle
>> allocation failures. Make this behavior explicit by replacing the code
>> that handles allocation errors in kh_resize_ and kh_put_ with BUG calls.
>
> Seems like a good idea.
>
> We're very sloppy about checking the "ret" field from kh_put_* for
> errors (it's a tri-state for "already existed", "newly added", or
> "error"). I think that's not a problem because as you show here, we
> can't actually hit the error case. This makes that much more obvious.
>
> Two nits if we wanted to go further:
>
>> diff --git a/khash.h b/khash.h
>> index 21c2095216..84ff7230b6 100644
>> --- a/khash.h
>> +++ b/khash.h
>> @@ -126,7 +126,7 @@ static const double __ac_HASH_UPPER = 0.77;
>>          if (h->size >= (khint_t)(new_n_buckets * __ac_HASH_UPPER + 0.5)) j = 0; /* requested size is too small */ \
>>          else { /* hash table size to be changed (shrink or expand); rehash */ \
>>              ALLOC_ARRAY(new_flags, __ac_fsize(new_n_buckets)); \
>> -            if (!new_flags) return -1; \
>> +            if (!new_flags) BUG("ALLOC_ARRAY failed"); \
>
> I converted this in b32fa95fd8 (convert trivial cases to ALLOC_ARRAY,
> 2016-02-22), but left the now-obsolete error-check.
>
> But a few lines below...
>
>>              memset(new_flags, 0xaa, __ac_fsize(new_n_buckets) * sizeof(khint32_t)); \
>>              if (h->n_buckets < new_n_buckets) { /* expand */ \
>>                  REALLOC_ARRAY(h->keys, new_n_buckets); \
>
> These REALLOC_ARRAY() calls are in the same boat. You dropped the error
> check in 2756ca4347 (use REALLOC_ARRAY for changing the allocation size
> of arrays, 2014-09-16).
>
> Should we make the two match? I'd probably do so by making the former
> match the latter, and just drop the conditional and BUG entirely.

Yeah, makes sense, thank you.

>
>> @@ -181,10 +181,10 @@ static const double __ac_HASH_UPPER = 0.77;
>>          if (h->n_occupied >= h->upper_bound) { /* update the hash table */ \
>>              if (h->n_buckets > (h->size<<1)) { \
>>                  if (kh_resize_##name(h, h->n_buckets - 1) < 0) { /* clear "deleted" elements */ \
>> -                    *ret = -1; return h->n_buckets; \
>> +                    BUG("kh_resize_" #name " failed"); \
>>                  } \
>>              } else if (kh_resize_##name(h, h->n_buckets + 1) < 0) { /* expand the hash table */ \
>> -                *ret = -1; return h->n_buckets; \
>> +                BUG("kh_resize_" #name " failed"); \
>
> After the first hunk, does kh_resize_*() ever return anything but 0? If
> not, can we drop its return entirely, making it more clear that it's not
> expected to fail? Both for human readers, but also for the compiler
> (which could then alert us at compile-time if we missed any error
> cases).

Good idea. Both types of changes make syncing with upstream a bit harder,
but even though the return type change bleeds into the caller, the
overall change affects only a small area.

René
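
To illustrate the point about the compiler, here is a toy sketch. It is
not the khash.h change itself: int_set, int_set_grow, int_set_put and
xrealloc_or_die are hypothetical names made up for this example, with
xrealloc_or_die standing in for an allocator that terminates the program
instead of returning NULL. Once the resize helper returns void, a
leftover check such as "if (int_set_grow(...) < 0)" no longer compiles,
and the put function only needs two result states instead of three.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a dying allocator: never returns NULL. */
static void *xrealloc_or_die(void *ptr, size_t size)
{
	void *ret = realloc(ptr, size ? size : 1);
	if (!ret) {
		fprintf(stderr, "fatal: out of memory\n");
		abort();
	}
	return ret;
}

/* Toy container (assumption for illustration, unrelated to khash). */
struct int_set {
	int *keys;
	size_t size, alloc;
};

/*
 * Returns void: growing either succeeds or the program has already
 * died.  A caller that still tests the result for an error would be
 * rejected at compile time.  (Overflow of the size computation is
 * ignored here to keep the sketch short.)
 */
static void int_set_grow(struct int_set *s, size_t new_alloc)
{
	if (new_alloc <= s->alloc)
		return;
	s->keys = xrealloc_or_die(s->keys, new_alloc * sizeof(*s->keys));
	s->alloc = new_alloc;
}

/* Returns 0 if the key already existed, 1 if newly added; no error state. */
static int int_set_put(struct int_set *s, int key)
{
	size_t i;

	for (i = 0; i < s->size; i++)
		if (s->keys[i] == key)
			return 0;
	if (s->size == s->alloc)
		int_set_grow(s, s->alloc ? 2 * s->alloc : 4);
	assert(s->size < s->alloc); /* guaranteed by int_set_grow() */
	s->keys[s->size++] = key;
	return 1;
}
```

Callers then only distinguish "already existed" from "newly added", e.g.
"if (int_set_put(&s, 7)) ...", and release the storage with free(s.keys).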