On Thu, Nov 05, 2020 at 12:25:14PM -0800, Junio C Hamano wrote: > Jeff King <peff@xxxxxxxx> writes: > > > Subject: [PATCH] shortlog: drop custom strset implementation > > > > We can use the strset recently added in strmap.h instead. Note that this > > doesn't have a "check_and_add" function. We can easily write the same > > thing using separate "contains" and "adds" calls. This is slightly less > > efficient, in that it hashes the string twice, but for our use here it > > shouldn't be a big deal either way. > > > > I did leave it as a separate helper function, though, since we use it in > > three separate spots (some of which are in the middle of a conditional). > > It makes sense, but check_dup() sounds like its use pattern would be > > if (check_dup(it) == NO_DUP) > add(it); > > where it is more like "add it but just once". Hmph. I picked that name (versus just "contains") hoping it be general enough to cover the dual operation. Better name suggestions are welcome. Though... > By the way, is a strset a set or a bag? If it makes the second add > an no-op, perhaps your check_dup() is what strset_add() should do > itself? What builtin/shortlog.c::check_dup() does smells like it is > a workaround for the lack of a naturally-expected feature. Yes, if strset_add() returned an integer telling us whether the item was already in the set, then we could use it directly. It's slightly non-trivial to do, though, as it's built around strmap_put(), which returns a pointer to the old value. But since we're a set and not a map, that value is always NULL; we can't tell the difference between "I was storing an old value which was NULL" and "I was not storing any value". If strset were built around strintmap it could store "1" for "present in the set". It somehow feels hacky, though, to induce extra value writes just for the sake of working around the API. Since strset is defined within strmap.c, perhaps it wouldn't be too bad for it to be more intimate with the details here. I.e., to use find_strmap_entry() directly, and if the value is not present, to create a new hashmap entry. That would require hacking up strmap_put() into a few helpers, but it's probably not too bad. -Peff