On 15 August 2017 at 20:48, Stefan Beller <sbeller@xxxxxxxxxx> wrote: >>>> /* total number of entries (0 means the hashmap is empty) */ >>>> - unsigned int size; >>>> + /* -1 means size is unknown for threading reasons */ >>>> + int size; >>> >>> This double-encodes the state of disallow_rehash (i.e. if we had >>> signed size, then the invariant disallow_rehash === (size < 0) >>> is true, such that we could omit either the flag and just check for >>> size < 0 or we do not need the negative size as any user would >>> need to check disallow_rehash first. Not sure which API is harder >>> to misuse. I'd think just having the size and getting rid of >>> disallow_rehash might be hard to to reused. >> >> (Do you mean "might be hard to be misused"?) > > yes, I do. > >> One good thing about turning off the size-tracking with threading is >> that someone who later wants to know the size in a threaded application >> will not introduce any subtle bugs by misusing size, but will be forced >> to provide and use some sort of InterlockedIncrement(). > > agreed. > >> When/if that >> change happens, it would be nice if no-one relied on the value of size >> to say anything about threading. So it might make sense to have an >> implementation-independent way of accessing disallow_rehash a.k.a. >> (size < 0). > > Yes, and my point was whether we want to keep disallow_rehash around, > as when a patch as this is applied, we'd have it encoded twice, > both size < 0 as well as disallow_rehash set indicate the rehashing > disabled. > > If we were to reduce it to one, we would not have "invalid" state possible > such as size < 0 and disallow_rehash = 0. Agreed. > In the future we may have more options that make size impossible to > compute efficiently, such that in that case we'd want to know which > condition lead to it. In that case we'd want to have the flags around. Good point. >> For example a function hashmap_disallow_rehash(), except that's >> obviously taken. :-) Maybe the existing function would then be >> hashmap_set_disallow_rehash(). Oh well.. > > Not sure I understand this one. Sorry. What I meant was, if we drop the disallow_rehash-field, someone might be tempted to use size < 0 (or size == -1) to answer the question "is rehashing disallowed?". (Or "am I threaded?" which already is a question which the hashmap as it is today doesn't know about.) So instead of looking at "disallow_rehash" one should perhaps be calling "hashmap_is_disallow_rehash()" or "hashmap_get_disallow_rehash()", which would be implemented as "return disallow_rehash", or possibly "return size == -1". Except such names are, to the best of my understanding, not the Git-way, so it should be, e.g., "hashmap_disallow_rehash()". Except ... that name is taken.... So to free that name up, the existing function should perhaps be renamed "hashmap_set_disallow_rehash()", again assuming I've picked up the right conventions in my recent browsing of the Git-code. The final "Oh well" was a short form of "it began with an observation which currently has no practical effect, and is slowly turning into a chain of ideas on how to rebuild the interface".