Jameson Miller <jamill@xxxxxxxxxxxxx> writes: > Add an API around managing the lifetime of cache_entry > structs. Abstracting memory management details behind this API will > allow for alternative memory management strategies without affecting > all the call sites. This commit does not change how memory is > allocated or freed. A later commit in this series will allocate cache > entries from memory pools as appropriate. > > Motivation: > It has been observed that the time spent loading an index with a large > number of entries is partly dominated by malloc() calls. This change > is in preparation for using memory pools to reduce the number of > malloc() calls made when loading an index. Not worth a reroll, but having these four lines at the very beginning, dropping the line "Motivation:", and then following that with "Add an API around ..." as the second paragraph, would make the above easier to read, without stutter-causing "Motivation:" in the middle. > diff --git a/apply.c b/apply.c > index 8ef975a32d..8a4a4439bc 100644 > --- a/apply.c > +++ b/apply.c > @@ -4092,12 +4092,12 @@ static int build_fake_ancestor(struct apply_state *state, struct patch *list) > return error(_("sha1 information is lacking or useless " > "(%s)."), name); > > - ce = make_cache_entry(patch->old_mode, &oid, name, 0, 0); > + ce = make_cache_entry(&result, patch->old_mode, &oid, name, 0, 0); > if (!ce) > return error(_("make_cache_entry failed for path '%s'"), > name); > if (add_index_entry(&result, ce, ADD_CACHE_OK_TO_ADD)) { > - free(ce); > + discard_cache_entry(ce); > return error(_("could not add %s to temporary index"), > name); > } So..., even though it wasn't clear in the proposed log message, two large part of the lifecycle management API is (1) make_cache_entry() knows for which istate it is creating the entry and (2) discarding the entry may not be just a simple matter of free()ing. Both of which makes perfect sense, but if the changes are that easily enumeratable, it would have been nicer for readers if the commit did so in the proposed log message. > @@ -4424,27 +4423,26 @@ static int add_conflicted_stages_file(struct apply_state *state, > struct patch *patch) > { > int stage, namelen; > - unsigned ce_size, mode; > + unsigned mode; > struct cache_entry *ce; > > if (!state->update_index) > return 0; > namelen = strlen(patch->new_name); > - ce_size = cache_entry_size(namelen); > mode = patch->new_mode ? patch->new_mode : (S_IFREG | 0644); > > remove_file_from_cache(patch->new_name); > for (stage = 1; stage < 4; stage++) { > if (is_null_oid(&patch->threeway_stage[stage - 1])) > continue; > - ce = xcalloc(1, ce_size); > + ce = make_empty_cache_entry(&the_index, namelen); ... and another one in the enumeration is make_empty_cache_entry() which is somehow different. And the readers are forced to read its implementation to find out how it is different, but the use of the same discard_cache_entry() suggests that the liftime rule of an entry allcoated by it may be similar to those created by make_cache_entry(). > ... > if (add_cache_entry(ce, ADD_CACHE_OK_TO_ADD) < 0) { > - free(ce); > + discard_cache_entry(ce); > return error(_("unable to add cache entry for %s"), > patch->new_name); > } > @@ -230,11 +230,11 @@ static int checkout_merged(int pos, const struct checkout *state) > if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid)) > die(_("Unable to add merge result for '%s'"), path); > free(result_buf.ptr); > - ce = make_cache_entry(mode, &oid, path, 2, 0); > + ce = make_transient_cache_entry(mode, &oid, path, 2); ... and then yet another, which is "transient". An intelligent reader can guess from the lack of istate parameter (and from the word "transient") that the resulting one would not belong to any in-core index. > if (!ce) > die(_("make_cache_entry failed for path '%s'"), path); > status = checkout_entry(ce, state, NULL); > - free(ce); > + discard_cache_entry(ce); ... but discovers that it is discarded the same way, realizes that ce knows how it was allocated to allow discard() different way to discard it, and his/her earlier conjecture about make_empty() does not hold at all and gets somewhat disappointed. > diff --git a/cache.h b/cache.h > index 3fbf24771a..035a627bea 100644 > --- a/cache.h > +++ b/cache.h > @@ -339,6 +339,40 @@ extern void remove_name_hash(struct index_state *istate, struct cache_entry *ce) > extern void free_name_hash(struct index_state *istate); > > > +/* Cache entry creation and cleanup */ > + > +/* > + * Create cache_entry intended for use in the specified index. Caller > + * is responsible for discarding the cache_entry with > + * `discard_cache_entry`. > + */ > +struct cache_entry *make_cache_entry(struct index_state *istate, > + unsigned int mode, > + const struct object_id *oid, > + const char *path, > + int stage, > + unsigned int refresh_options); > + > +struct cache_entry *make_empty_cache_entry(struct index_state *istate, > + size_t name_len); > + > +/* > + * Create a cache_entry that is not intended to be added to an index. > + * Caller is responsible for discarding the cache_entry > + * with `discard_cache_entry`. > + */ > +struct cache_entry *make_transient_cache_entry(unsigned int mode, > + const struct object_id *oid, > + const char *path, > + int stage); > + > +struct cache_entry *make_empty_transient_cache_entry(size_t name_len); OK, finally it becomes clear that we have per-istate and transient sets of two (i.e. one that takes the path, stage and mode pfront, and the other that only takes the length of the name), and ... > +/* > + * Discard cache entry. > + */ > +void discard_cache_entry(struct cache_entry *ce); ... a single function that knows how to discard each kind. It would have really helped the reader to talk about them in the proposed log message, as they are only 5 functions. Another way to make it easier for readers would have been to show the diff for cache.h first before diffs for all the others.