Re: [PATCH] fetch: replace string-list used as a look-up table with a hashmap

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Sep 26, 2018 at 02:28:28PM -0700, Junio C Hamano wrote:

> In find_non_local_tags() helper function (used to implement the
> "follow tags"), we use string_list_has_string() on two string lists
> as a way to see if a refname has already been processed, etc.
> 
> All this code predates more modern in-core lookup API like hashmap;
> replace them with two hashmaps and one string list---the hashmaps
> are used for look-ups and the string list is to keep the order of
> items in the returned result stable (which is the only single thing
> hashmap does worse than lookups on string-list).
> 
> Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx>
> ---
> 
>  * Just to remind ourselves that we talked about getting rid of
>    string-list used as look-up tables by replacing them with hashmap
>    but haven't seen much effort expended on it.  I think this is my
>    first semi-serious use of hashmap, and the code is expected to be
>    full of newbie mistakes, inefficiencies and ignorance of idioms;
>    pointing out any of which is much appreciated.

As you probably noticed, there's a bit of boilerplate required to use a
hashmap. I had figured we could replace most of these with a single
"strmap" API to map a string to a void pointer (which is basically
what string-list gives you).

That would save on the boilerplate. But your solution here replaces a
"void *" pointer with an actual "struct object_id" member, which
improves type-safety. It also removes questions about memory lifetimes
(at the minor cost of copying the oids). So this path is probably better
if we don't mind a little extra code.

I do note that your struct just has the same information as "struct
ref":

> +struct refname_hash_entry {
> +	struct hashmap_entry ent; /* must be the first member */
> +	struct object_id oid;
> +	char refname[FLEX_ARRAY];
> +};

So yet another alternative here is to just define a single hashmap that
stores void pointers. That also throws away some type safety, but is
maybe conceptually simpler. I dunno. It's actually a pain to do that
with "struct hashmap" because it requires the caller to handle
allocation. An open-addressed hash table, as we use elsewhere (and in
khash.h) is a bit simpler, since it doesn't need to do any per-entry
malloc.

To be clear, I'm perfectly happy with the approach in your patch here.
I'm just musing on what might might be the least painful thing for doing
more of them.

-Peff



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux