Re: [PATCH v4 02/14] add a hashtable implementation that supports O(1) removal

Karsten Blees <karsten.blees@xxxxxxxxx> writes:

> What about this:
>
> #define HASHMAP_GROW_AT 80
> #define HASHMAP_SHRINK_AT 16

I am not too enthused for three reasons. The fact that these are
100-based numbers is not written down anywhere other than the place
they are used, the places that use these need to consistently divide
by 100, which invites unnecessary bugs, and compared to the
original, you now require 16/100 but you didn't even want the exact
16% in the first place (i.e. a simple 1/6 was good enough, and it
still is).
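
To make the "divide by 100" point concrete, here is a minimal sketch.
The PCT_* names and the helper functions are made up for illustration
and are not from the patch; it only assumes the table tracks an entry
count and a table size.

```c
#include <stddef.h>

/* Hypothetical percent-based constants, as in the proposal quoted above. */
#define PCT_GROW_AT 80
#define PCT_SHRINK_AT 16

/* Every use site has to remember the implicit base of 100. */
static int pct_needs_grow(size_t size, size_t tablesize)
{
	return size * 100 > tablesize * PCT_GROW_AT;
}

static int pct_needs_shrink(size_t size, size_t tablesize)
{
	return size * 100 < tablesize * PCT_SHRINK_AT;
}

/* Ratio form: "less than 1/6 full" is stated directly, no hidden base. */
static int ratio_needs_shrink(size_t size, size_t tablesize)
{
	return size * 6 < tablesize;
}
```

With a 64-slot table both shrink tests flip between 10 and 11 entries,
so the extra precision of 16/100 over 1/6 buys nothing there.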

>> Perhaps
>> 
>> #define HASHMAP_GROW_AT(current) ((current) + ((current) >> 2))
>> #define HASHMAP_SHRINK_AT(current) ((current) * 6)
>> #define HASHMAP_GROW(current) ((current) << 2)
>> #define HASHMAP_SHRINK(current) ((current) >> 2)
>> 
>> may alleviate my worries; I dunno.

>>> +
>>> +void hashmap_free(struct hashmap *map, hashmap_free_fn free_function)
>>> +{
>> 
>> Why is free_function not part of the constants defined at
>> hashmap_init() time?  Your API allows the same hashmap, depending on
>> the way it has been used, to be cleaned up with different
>> free_function, but I am not sure if that "flexibility" is intended
>> (and in what application it would be useful).
>> 
>
> The free_function is a convenience so you don't have to loop over
> the entries yourself. ...
> ...a simple 'to free or not to free' boolean would suffice.

That is not the "flexibility" I was talking about. Your API allows
one to write a single program that does this:

	struct hashmap map;

	hashmap_init(&map, compare_fn);
	/* add/put/remove on map */

	if (phase_of_moon())
		hashmap_free(&map, free_them_in_one_way);
	else
		hashmap_free(&map, free_them_in_another_way);

Just like your _init takes a comparison function to make it clear
that all entries will be compared using the same function throughout
the life of the map, if it takes a free function (and you can use
NULL to mean "do not free, I am managing elements myself"), I would
think that it will make it clear that the elements in that map will
be freed the same way.
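
A rough sketch of what I mean, assuming a toy container where a
growable array stands in for the real hash table. All toy_* names are
hypothetical and not part of the posted API; the only point is that
the free function is fixed at init time and the teardown call takes
no function argument.

```c
#include <stdlib.h>

typedef int (*entry_cmp_fn)(const void *a, const void *b);
typedef void (*entry_free_fn)(void *entry);

/* Toy container: a growable array stands in for the real hash table. */
struct toy_map {
	entry_cmp_fn cmp;
	entry_free_fn free_entry;	/* NULL: caller manages the entries */
	void **entries;
	size_t nr, alloc;
};

static void toy_map_init(struct toy_map *map, entry_cmp_fn cmp,
			 entry_free_fn free_entry)
{
	map->cmp = cmp;
	map->free_entry = free_entry;
	map->entries = NULL;
	map->nr = map->alloc = 0;
}

static int toy_map_add(struct toy_map *map, void *entry)
{
	if (map->nr == map->alloc) {
		size_t alloc = map->alloc ? map->alloc * 2 : 8;
		void **p = realloc(map->entries, alloc * sizeof(*p));
		if (!p)
			return -1;
		map->entries = p;
		map->alloc = alloc;
	}
	map->entries[map->nr++] = entry;
	return 0;
}

/* No function argument here: the one given at init time is always used. */
static void toy_map_free(struct toy_map *map)
{
	size_t i;

	if (map->free_entry)
		for (i = 0; i < map->nr; i++)
			map->free_entry(map->entries[i]);
	free(map->entries);
	toy_map_init(map, map->cmp, map->free_entry);
}

/* Demo helper: frees an entry and counts how many were released. */
static int toy_freed;
static void toy_count_free(void *entry)
{
	free(entry);
	toy_freed++;
}
```

A caller would say toy_map_init(&map, cmp, toy_count_free) once, and
the phase_of_moon() choice above can no longer be written.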

And it will allow your _put to call that free function when you
replace an existing entry with a new one, if that becomes necessary.
The API in the posted version seems to make it the responsibility of
the caller of _put to do whatever clean-up is necessary on the
returned value (which is the entry that was replaced and is no longer
in the hashmap).  But within the context of a patch series whose later
patch changes the API to replace or remove an entry from the index in
such a way as to shift the responsibility of freeing it from the
caller to the callee, making _put and _remove responsible for calling
the per-element free is a possibility you may want to consider, no?
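
For illustration only, a sketch of a _put that owns the clean-up.  It
assumes a fixed array with a linear scan in place of real hashing, and
every name in it (owner_map, kv, and so on) is made up rather than
taken from the series.

```c
#include <stddef.h>

typedef int (*entry_cmp_fn)(const void *a, const void *b);
typedef void (*entry_free_fn)(void *entry);

#define OWNER_MAP_SLOTS 8

struct owner_map {
	entry_cmp_fn cmp;		/* returns 0 when two entries are equal */
	entry_free_fn free_entry;	/* NULL: caller manages the entries */
	void *slot[OWNER_MAP_SLOTS];
};

static void owner_map_init(struct owner_map *map, entry_cmp_fn cmp,
			   entry_free_fn free_entry)
{
	size_t i;

	map->cmp = cmp;
	map->free_entry = free_entry;
	for (i = 0; i < OWNER_MAP_SLOTS; i++)
		map->slot[i] = NULL;
}

/*
 * Add an entry, or replace an equal one.  The replaced entry is handed
 * to the map's free function instead of being returned to the caller.
 * Returns 0 on success, -1 if the toy table is full.
 */
static int owner_map_put(struct owner_map *map, void *entry)
{
	size_t i, empty = OWNER_MAP_SLOTS;

	for (i = 0; i < OWNER_MAP_SLOTS; i++) {
		if (!map->slot[i]) {
			if (empty == OWNER_MAP_SLOTS)
				empty = i;
		} else if (!map->cmp(map->slot[i], entry)) {
			if (map->free_entry)
				map->free_entry(map->slot[i]);
			map->slot[i] = entry;
			return 0;
		}
	}
	if (empty == OWNER_MAP_SLOTS)
		return -1;
	map->slot[empty] = entry;
	return 0;
}

/* Demo entry type and callbacks. */
struct kv { int key, value; };

static int kv_cmp(const void *a, const void *b)
{
	return ((const struct kv *)a)->key != ((const struct kv *)b)->key;
}

static int kv_released;
static void kv_release(void *entry)
{
	(void)entry;	/* demo entries live on the stack; just count */
	kv_released++;
}
```

Putting two entries with the same key releases the first one exactly
once; putting an entry with a new key releases nothing.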

>>> +	if (map->tablesize > HASHMAP_INITIAL_SIZE &&
>>> +	    map->size * HASHMAP_SHRINK_AT < map->tablesize)
>>> +		rehash(map, map->tablesize >> HASHMAP_GROW);
>> 
>> This "we shrink by the same amount" looks inconsistent with the use
>> of separate grow-at and shrink-at constants (see above for four
>> suggested #define's).
>> 
>
> These values account for a small hysteresis so that there is no size
> at which a sequence of add, remove, add, remove (or put, put, put,
> put) results in permanent resizes.

I was commenting on the two bottom lines of the above three line
quote from the patch.  You use SHRINK_AT to decide if you want to
shrink, and you use >>GROW to do the actual shrinking.  Why isn't it
like this instead?

	if (map->tablesize > HASHMAP_INITIAL_SIZE &&
	    HASHMAP_SHRINK_AT(map->size) < map->tablesize)
		rehash(map, map->tablesize >> HASHMAP_SHRINK);

The fact that the constant used for shrinking was not called SHRINK
but GROW was what caught my attention.
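
As a sketch of how the pieces could fit together, here is a resize
policy written with function-like macros, with the shift in GROW_AT
parenthesized explicitly.  The next_tablesize() helper is hypothetical
and the initial size of 64 is an assumption for the example.

```c
#include <stddef.h>

#define HASHMAP_INITIAL_SIZE 64	/* assumed value for this example */

/* Thresholds and steps, each named after the direction it serves. */
#define HASHMAP_GROW_AT(n)   ((n) + ((n) >> 2))	/* grow when > 80% full */
#define HASHMAP_SHRINK_AT(n) ((n) * 6)		/* shrink when < 1/6 full */
#define HASHMAP_GROW(n)      ((n) << 2)
#define HASHMAP_SHRINK(n)    ((n) >> 2)

/* Table size to use once the map holds "size" entries. */
static size_t next_tablesize(size_t size, size_t tablesize)
{
	if (HASHMAP_GROW_AT(size) > tablesize)
		return HASHMAP_GROW(tablesize);
	if (tablesize > HASHMAP_INITIAL_SIZE &&
	    HASHMAP_SHRINK_AT(size) < tablesize)
		return HASHMAP_SHRINK(tablesize);
	return tablesize;
}
```

With these numbers a 64-slot table grows to 256 at 52 entries and does
not shrink back until it drops to 42, so the hysteresis is kept while
the shrink path uses only SHRINK-named macros.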