Prior to this patch series, the refs API said nothing about the lifetime of the refname parameter passed to each_ref_fn callbacks by the for_each_ref()-style iteration functions. De facto, the refnames usually had long lives because they were pointers into the ref_cache data structures, and those are only invalidated under rare circumstances. And some callers were assuming a long lifetime, for example storing references to the refname string instead of copying it. But it has long been the case that ref caches could be invalidated, for example when a packed ref is deleted. AFAIK there was never much clarity about what that might mean for callers. Recently a number of race conditions related to references have been discovered. There is likely to be a two-pronged solution to the races: * For traditional, filesystem-based references, there will have to be more checks that the ref caches are still up-to-date at the time of their use (see, for example, [1]). If not, the ref cache will have to be invalidated and reloaded. Assuming that the invalidation of the old cache includes freeing its memory, such an invalidation will cause lots of refname strings to be freed even though callers might still hold references to them. * For server-class installations, filesystem-based references might not be robust enough for 100% reliable operation, because the reading of the complete set of references is not an atomic operation. If another reference storage mechanism is developed, there is no reason to expect the refnames strings to have long lifetimes. A prerequisite for either of these approaches is to harmonize what callers assume and what the API guarantees. The purpose of this patch series is to track down callers who assume that the refnames that they receive via a each_ref_fn callback have lifetimes beyond the duration of the callback invocation and to rewrite them to work without that assumption. The final patch documents explicitly that callers should not retain references to the refnames. A word about how I audited the code: To find callers making unwarranted assumptions, I (temporarily) changed do_one_ref() to do a xstrdup() of the refname, pass the copy to the callback function, then free() the copy. This caused ill-behaved callers to access freed memory, which could be detected by running the testsuite under valgrind. There were indeed a number of such errors. All of them are fixed by this patch series, and the test just described now runs without errors. I plan to do a second audit by hand to see if the test suite and/or valgrind missed anything. The last two patches are RFCs. I would like some input on the second to last because I am not very familiar with how the object array entry names are used, how many might be created, etc. The last patch is an illustration of how I would like to change the API docs, but it will only be ready when all of the code has been audited and adapted. Please see especially my comments on these two patches for more information. [1] http://thread.gmane.org/gmane.comp.version-control.git/223299 Michael Haggerty (17): describe: make own copy of refname fetch: make own copies of refnames add_rev_cmdline(): make a copy of the name argument builtin_diff_tree(): make it obvious that function wants two entries cmd_diff(): use an object_array for holding trees cmd_diff(): rename local variable "list" -> "entry" cmd_diff(): make it obvious which cases are exclusive of each other revision: split some overly-long lines gc_boundary(): move the check "alloc <= nr" to caller get_revision_internal(): make check less mysterious object_array: add function object_array_filter() object_array_remove_duplicates(): rewrite to reduce copying fsck: don't put a void*-shaped peg in a char*-shaped hole find_first_merges(): initialize merges variable using initializer find_first_merges(): remove unnecessary code object_array_entry: copy name before storing in name field refs: document the lifetime of the refname passed to each_ref_fn builtin/describe.c | 6 +++-- builtin/diff.c | 68 ++++++++++++++++++++++++++---------------------------- builtin/fetch.c | 4 ++-- builtin/fsck.c | 2 +- object.c | 50 +++++++++++++++++++++++++++++++-------- object.h | 23 ++++++++++++++++-- refs.h | 22 +++++++++++++----- revision.c | 61 +++++++++++++++++++++++++----------------------- revision.h | 32 ++++++++++++++++--------- submodule.c | 6 ++--- 10 files changed, 172 insertions(+), 102 deletions(-) -- 1.8.2.3 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html