Re: [PATCH v2 4/7] merge-ort: switch our strmaps over to using memory pools

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Jul 29 2021, Jeff King wrote:

> On Thu, Jul 29, 2021 at 12:37:52PM -0600, Elijah Newren wrote:
>
>> > Arguably, the existence of these function indirections is perhaps a sign
>> > that the strmap API should provide a version of the clear functions that
>> > takes "partial / not-partial" as a parameter.
>> 
>> Are you suggesting a modification of str{map,intmap,set}_clear() to
>> take an extra parameter, or removing the
>> str{map,intmap,set}_partial_clear() functions and introducing new
>> functions that take a partial/not-partial parameter?  I think you're
>> suggesting the latter, and that makes more sense to me...but I'm
>> drawing blanks trying to come up with a reasonable function name.
>
> It does seem a shame to add the "partial" parameter to strmap_clear(),
> just because most callers don't need it (so they end up with this
> inscrutable "0" parameter).
>
> What if there was a flags field? Then it could be combined with the
> free_values parameter. The result is kind of verbose in two ways:
>
>  - now strset_clear(), etc, need a "flags" parameter, which they didn't
>    before (and is just "0" most of the time!)
>
>  - now "strmap_clear(foo, 1)" becomes "strmap_clear(foo, STRMAP_FREE_VALUES)".
>    That's a lot longer, though arguably it's easier to understand since
>    the boolean is explained.
>
> Having gone through the exercise, I am not sure it is actually making
> anything more readable (messy patch is below for reference).

I've got some WIP patches for string-list.h and strmap.h to make the API
nicer, and it's probably applicable to strset.h too.

I.e. I found when using strset.h that it was a weird API to use, because
unlike string-list.h it didn't pay attention to your "dup" field when
freeing, you had to do it explicitly.

And then in e.g. merge-ort.c there's this "strdup dance" pattern where
we flip the field back and forth.

The below diff is exctracted from that WIP work, with the relevant two
API headers and then two changed API users for show (the tree-wide
changes are much larger).

I think making the promise I make in the updated docs at "We guarantee
that the `clearfunc`[...]" in string-list.h makes for particularly nice
API behavior.

 builtin/remote.c | 37 ++++++++++++++++++++---------------
 merge-ort.c      | 32 +++++++-----------------------
 string-list.h    | 59 ++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 strmap.h         | 13 +++++++++++++
 4 files changed, 98 insertions(+), 43 deletions(-)

diff --git a/builtin/remote.c b/builtin/remote.c
index 7f88e6ce9de..ec1dbd49f71 100644
--- a/builtin/remote.c
+++ b/builtin/remote.c
@@ -340,10 +340,24 @@ static void read_branches(void)
 
 struct ref_states {
 	struct remote *remote;
-	struct string_list new_refs, stale, tracked, heads, push;
+
+	struct string_list new_refs;
+	struct string_list stale;
+	struct string_list tracked;
+	struct string_list heads;
+	struct string_list push;
+
 	int queried;
 };
 
+#define REF_STATES_INIT { \
+	.new_refs = STRING_LIST_INIT_DUP, \
+	.stale = STRING_LIST_INIT_DUP, \
+	.tracked = STRING_LIST_INIT_DUP, \
+	.heads = STRING_LIST_INIT_DUP, \
+	.push = STRING_LIST_INIT_DUP, \
+}
+
 static int get_ref_states(const struct ref *remote_refs, struct ref_states *states)
 {
 	struct ref *fetch_map = NULL, **tail = &fetch_map;
@@ -355,9 +369,6 @@ static int get_ref_states(const struct ref *remote_refs, struct ref_states *stat
 			die(_("Could not get fetch map for refspec %s"),
 				states->remote->fetch.raw[i]);
 
-	states->new_refs.strdup_strings = 1;
-	states->tracked.strdup_strings = 1;
-	states->stale.strdup_strings = 1;
 	for (ref = fetch_map; ref; ref = ref->next) {
 		if (!ref->peer_ref || !ref_exists(ref->peer_ref->name))
 			string_list_append(&states->new_refs, abbrev_branch(ref->name));
@@ -406,7 +417,6 @@ static int get_push_ref_states(const struct ref *remote_refs,
 
 	match_push_refs(local_refs, &push_map, &remote->push, MATCH_REFS_NONE);
 
-	states->push.strdup_strings = 1;
 	for (ref = push_map; ref; ref = ref->next) {
 		struct string_list_item *item;
 		struct push_info *info;
@@ -449,7 +459,6 @@ static int get_push_ref_states_noquery(struct ref_states *states)
 	if (remote->mirror)
 		return 0;
 
-	states->push.strdup_strings = 1;
 	if (!remote->push.nr) {
 		item = string_list_append(&states->push, _("(matching)"));
 		info = item->util = xcalloc(1, sizeof(struct push_info));
@@ -483,7 +492,6 @@ static int get_head_names(const struct ref *remote_refs, struct ref_states *stat
 	refspec.force = 0;
 	refspec.pattern = 1;
 	refspec.src = refspec.dst = "refs/heads/*";
-	states->heads.strdup_strings = 1;
 	get_fetch_map(remote_refs, &refspec, &fetch_map_tail, 0);
 	matches = guess_remote_head(find_ref_by_name(remote_refs, "HEAD"),
 				    fetch_map, 1);
@@ -905,7 +913,7 @@ static void clear_push_info(void *util, const char *string)
 {
 	struct push_info *info = util;
 	free(info->dest);
-	free(info);
+	/* note: fixed memleak here */
 }
 
 static void free_remote_ref_states(struct ref_states *states)
@@ -1159,7 +1167,7 @@ static int get_one_entry(struct remote *remote, void *priv)
 		string_list_append(list, remote->name)->util =
 				strbuf_detach(&url_buf, NULL);
 	} else
-		string_list_append(list, remote->name)->util = NULL;
+		string_list_append(list, remote->name);
 	if (remote->pushurl_nr) {
 		url = remote->pushurl;
 		url_nr = remote->pushurl_nr;
@@ -1179,10 +1187,9 @@ static int get_one_entry(struct remote *remote, void *priv)
 
 static int show_all(void)
 {
-	struct string_list list = STRING_LIST_INIT_NODUP;
+	struct string_list list = STRING_LIST_INIT_DUP;
 	int result;
 
-	list.strdup_strings = 1;
 	result = for_each_remote(get_one_entry, &list);
 
 	if (!result) {
@@ -1212,7 +1219,7 @@ static int show(int argc, const char **argv)
 		OPT_BOOL('n', NULL, &no_query, N_("do not query remotes")),
 		OPT_END()
 	};
-	struct ref_states states;
+	struct ref_states states = REF_STATES_INIT;
 	struct string_list info_list = STRING_LIST_INIT_NODUP;
 	struct show_info info;
 
@@ -1334,8 +1341,7 @@ static int set_head(int argc, const char **argv)
 	if (!opt_a && !opt_d && argc == 2) {
 		head_name = xstrdup(argv[1]);
 	} else if (opt_a && !opt_d && argc == 1) {
-		struct ref_states states;
-		memset(&states, 0, sizeof(states));
+		struct ref_states states = REF_STATES_INIT;
 		get_remote_ref_states(argv[0], &states, GET_HEAD_NAMES);
 		if (!states.heads.nr)
 			result |= error(_("Cannot determine remote HEAD"));
@@ -1374,14 +1380,13 @@ static int set_head(int argc, const char **argv)
 static int prune_remote(const char *remote, int dry_run)
 {
 	int result = 0;
-	struct ref_states states;
+	struct ref_states states = REF_STATES_INIT;
 	struct string_list refs_to_prune = STRING_LIST_INIT_NODUP;
 	struct string_list_item *item;
 	const char *dangling_msg = dry_run
 		? _(" %s will become dangling!")
 		: _(" %s has become dangling!");
 
-	memset(&states, 0, sizeof(states));
 	get_remote_ref_states(remote, &states, GET_REF_STATES);
 
 	if (!states.stale.nr) {
diff --git a/merge-ort.c b/merge-ort.c
index ec0c5904211..53ed78e7a01 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -432,16 +432,6 @@ struct conflict_info {
 	assert((ci) && !(mi)->clean);        \
 } while (0)
 
-static void free_strmap_strings(struct strmap *map)
-{
-	struct hashmap_iter iter;
-	struct strmap_entry *entry;
-
-	strmap_for_each_entry(map, &iter, entry) {
-		free((char*)entry->key);
-	}
-}
-
 static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
 					  int reinitialize)
 {
@@ -455,13 +445,11 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
 		reinitialize ? strset_partial_clear : strset_clear;
 
 	/*
-	 * We marked opti->paths with strdup_strings = 0, so that we
-	 * wouldn't have to make another copy of the fullpath created by
-	 * make_traverse_path from setup_path_info().  But, now that we've
-	 * used it and have no other references to these strings, it is time
-	 * to deallocate them.
+	 * We used the the pattern of re-using already allocated
+	 * strings strmap_clear_strings() in make_traverse_path from
+	 * setup_path_info(). Deallocate them.
 	 */
-	free_strmap_strings(&opti->paths);
+	strmap_clear_strings(&opti->paths, 0);
 	strmap_func(&opti->paths, 1);
 
 	/*
@@ -472,15 +460,10 @@ static void clear_or_reinit_internal_opts(struct merge_options_internal *opti,
 	strmap_func(&opti->conflicted, 0);
 
 	/*
-	 * opti->paths_to_free is similar to opti->paths; we created it with
-	 * strdup_strings = 0 to avoid making _another_ copy of the fullpath
-	 * but now that we've used it and have no other references to these
-	 * strings, it is time to deallocate them.  We do so by temporarily
-	 * setting strdup_strings to 1.
+	 * opti->paths_to_free is similar to opti->paths; it's memory
+	 * we borrowed and need to free with string_list_clear_strings().
 	 */
-	opti->paths_to_free.strdup_strings = 1;
-	string_list_clear(&opti->paths_to_free, 0);
-	opti->paths_to_free.strdup_strings = 0;
+	string_list_clear_strings(&opti->paths_to_free, 0);
 
 	if (opti->attr_index.cache_nr) /* true iff opt->renormalize */
 		discard_index(&opti->attr_index);
@@ -2664,7 +2647,6 @@ static int collect_renames(struct merge_options *opt,
 	 * and have no other references to these strings, it is time to
 	 * deallocate them.
 	 */
-	free_strmap_strings(&collisions);
 	strmap_clear(&collisions, 1);
 	return clean;
 }
diff --git a/string-list.h b/string-list.h
index 0d6b4692396..9eeea996888 100644
--- a/string-list.h
+++ b/string-list.h
@@ -109,6 +109,9 @@ void string_list_init_dup(struct string_list *list);
  */
 void string_list_init(struct string_list *list, int strdup_strings);
 
+void string_list_cmp_init(struct string_list *list, int strdup_strings,
+			  compare_strings_fn cmp);
+
 /** Callback function type for for_each_string_list */
 typedef int (*string_list_each_func_t)(struct string_list_item *, void *);
 
@@ -129,14 +132,66 @@ void filter_string_list(struct string_list *list, int free_util,
  */
 void string_list_clear(struct string_list *list, int free_util);
 
+/**
+ * Free a string list initialized without `strdup_strings = 1`, but
+ * where we also want to free() the strings. You usually want to just
+ * use string_list_clear() after initializing with
+ * `STRING_LIST_INIT_DUP' instead.
+ *
+ * Useful to free e.g. a string list whose strings came from
+ * strbuf_detach() or other memory that we didn't initially allocate
+ * on the heap, but which we now manage.
+ *
+ * Under the hood this is identical in behavior to temporarily setting
+ * `strbuf_strings` to `1` for the duration of this function call, but
+ * without the verbosity of performing that dance yourself.
+ */
+void string_list_clear_strings(struct string_list *list, int free_util);
+
+/**
+ * Clear only the `util` pointer, but not the `string`, even if
+ * `strdup_strings = 1` is set. Useful for the idiom of doing e.g.:
+ *
+ *    string_list_append(&list, str + offs)->util = str;
+ *
+ * Where we add a string at some offset, own the string (so
+ * effectively `strdup_strings = `), but can't free() the string
+ * itself at the changed offset, but need to free the original data in
+ * `util` instead.
+ */
+void string_list_clear_util(struct string_list *list);
+
 /**
  * Callback type for `string_list_clear_func`.  The string associated
  * with the util pointer is passed as the second argument
  */
 typedef void (*string_list_clear_func_t)(void *p, const char *str);
 
-/** Call a custom clear function on each util pointer */
-void string_list_clear_func(struct string_list *list, string_list_clear_func_t clearfunc);
+/**
+ * Like string_list_clear() except that it first calls a custom clear
+ * function on each util pointer.
+ *
+ * We guarantee that the `clearfunc` will be called on all util
+ * pointers in a list before we proceed to free the first string or
+ * util pointer, i.e. should you need to it's OK to peek at other util
+ * items in the list itself, or to otherwise iterate it from within
+ * the `clearfunc`.
+ *
+ * You do not need to free() the passed-in util pointer itself,
+ * i.e. after calling all `clearfunc` this has the seme behavior as
+ * string_list_clear() called with with `free_util = 1`.
+ */
+void string_list_clear_func(struct string_list *list,
+			    string_list_clear_func_t clearfunc);
+
+/**
+ * Like string_list_clear_func() but free the strings too, using the
+ * same dance as described for string_list_clear_strings()
+ * above. You'll usually want to initialize with
+ * `STRING_LIST_INIT_DUP` and use string_list_clear_strings() instead.
+ */
+void string_list_clear_strings_func(struct string_list *list,
+				    string_list_clear_func_t clearfunc);
 
 /**
  * Apply `func` to each item. If `func` returns nonzero, the
diff --git a/strmap.h b/strmap.h
index 1e152d832d6..337f6278e86 100644
--- a/strmap.h
+++ b/strmap.h
@@ -51,12 +51,25 @@ void strmap_init_with_options(struct strmap *map,
  */
 void strmap_clear(struct strmap *map, int free_values);
 
+/**
+ * To strmap_clear() what string_list_clear_strings() is to
+ * string_list_clear(). I.e. free your keys too, which we used as-is
+ * without `strdup_strings = 1`.
+ */
+void strmap_clear_strings(struct strmap *map, int free_values);
+
 /*
  * Similar to strmap_clear() but leaves map->map->table allocated and
  * pre-sized so that subsequent uses won't need as many rehashings.
  */
 void strmap_partial_clear(struct strmap *map, int free_values);
 
+/**
+ * To strmap_partial_clear() what string_list_clear_strings() is to
+ * string_list_clear(). See strmap_clear_strings() above.
+ */
+void strmap_partial_clear_strings(struct strmap *map, int free_values);
+
 /*
  * Insert "str" into the map, pointing to "data".
  *



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux