On 1/8/2023 9:38 PM, Junio C Hamano wrote: > "Derrick Stolee via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > >> +static const char *heuristics[] = { >> + [BUNDLE_HEURISTIC_NONE] = "", >> + [BUNDLE_HEURISTIC_CREATIONTOKEN] = "creationToken", >> +}; > > Ideally it would require the least amount of maintenance if we could > define BUNDLE_HEURISTIC__COUNT as ARRAY_SIZE() of this thing, but it > being a file scope static, it might not be easy to arrange that. As > a lessor altenative, would it make it safer to size this array more > explicitly using BUNDLE_HEURISTIC__COUNT macro? > > static const char *heuristics[BUNDLE_HEURISTIC__COUNT] = { > ... > }; Yes, I should have used this size indicator. > or is it more-or-less moot point to aim for safety because nobody > enforces that these [indices] used to define the contents of this > array are dense? > > That is ... > >> @@ -142,6 +150,19 @@ static int bundle_list_update(const char *key, const char *value, >> return 0; >> } >> >> + if (!strcmp(subkey, "heuristic")) { >> + int i; >> + for (i = 0; i < BUNDLE_HEURISTIC__COUNT; i++) { >> + if (!strcmp(value, heuristics[i])) { >> + list->heuristic = i; >> + return 0; >> + } >> + } > > ... this strcmp() will segfault if heuristics[] array is sparse, or > BUNDLE_HEURISTIC__COUNT is larger than the array (i.e. you add a new > heuristic in "enum bundle_heuristic" before the __COUNT sentinel, > but forget to add it to the heuristics[] array). > > "You are worrying too much. Our developers would notice a segfault > and the current code, which may look risky to you, is something they > can live with", is a perfectly acceptable response, but somehow I > have this nagging feeling that we should be able to make it easier > to maintain without incurring extra runtime cost. You're right. I was following an established pattern of linking enums to values, but I'm not sure that those other examples will loop over the array like this looking for a value. A safer approach would be to have an array of (enum, string) pairs that could either be iterated in a loop (fast enough for a small number of enum values, such as this case) or used to populate a hashmap at runtime if needed for a large number of queries. >> diff --git a/bundle-uri.h b/bundle-uri.h >> index d5e89f1671c..ad82174112d 100644 >> --- a/bundle-uri.h >> +++ b/bundle-uri.h >> @@ -52,6 +52,14 @@ enum bundle_list_mode { >> BUNDLE_MODE_ANY >> }; >> >> +enum bundle_list_heuristic { >> + BUNDLE_HEURISTIC_NONE = 0, >> + BUNDLE_HEURISTIC_CREATIONTOKEN, >> + >> + /* Must be last. */ >> + BUNDLE_HEURISTIC__COUNT, >> +}; > > The only reason to leave a trailing comma is to make it easy to > append new values at the end. By omitting the trailing comma, you > can doubly signal "Must be last" here (not suggesting to remove the > comment; suggesting to remove the trailing comma). This is a great example of "doing the typically right thing" but without thinking of _why_ we do that thing. Thanks for pointing this out. Thanks, -Stolee