On Wed, Dec 18, 2013 at 07:48:59PM -0600, Tom Miller wrote: > I did not intend to introduce new lingo. I did some searching through > history to see if something like this had been worked on before and > I found a commit by Jeff King that introduced me the the idea of > "DF conflicts" I take all the blame. :) As for the patch itself: > >> diff --git a/builtin/fetch.c b/builtin/fetch.c > >> index e50b697..845c687 100644 > >> --- a/builtin/fetch.c > >> +++ b/builtin/fetch.c > >> @@ -868,11 +868,6 @@ static int do_fetch(struct transport *transport, > >> > >> if (tags == TAGS_DEFAULT && autotags) > >> transport_set_option(transport, TRANS_OPT_FOLLOWTAGS, "1"); > >> - if (fetch_refs(transport, ref_map)) { > >> - free_refs(ref_map); > >> - retcode = 1; > >> - goto cleanup; > >> - } > >> if (prune) { > >> /* > >> * We only prune based on refspecs specified > >> @@ -888,6 +883,11 @@ static int do_fetch(struct transport *transport, > >> transport->url); > >> } > >> } > >> + if (fetch_refs(transport, ref_map)) { > >> + free_refs(ref_map); > >> + retcode = 1; > >> + goto cleanup; > >> + } I think this is _probably_ a good thing to do, but it does have an interesting side effect for concurrent operations, and I haven't seen that mentioned so far in the discussion. Readers of the ref namespace don't have any sort of transactionally consistent view of all of the refs. So if a remote has moved a branch "foo" to "bar" and we "fetch --prune", there will be a moment where a simultaneous reader will see one of two states that never existed on the remote (depending on the order the fetch chooses): either both refs exist, or neither exists. Right now fetch creates first and deletes after, so a simultaneous reader may see both refs. After your change, it may see no refs at all. Even though both are technically wrong, the current behavior is safer. If the reader is calculating reachability (e.g., for a repack or "git prune), it is better to have too many references than too few. I'm not sure to what degree we want to care. This is a race, but it's a reasonably unlikely one, and the D/F thing bites people in the real world. And further confounding this is the fact that even if the writer does everything correctly, the way we read refs can still cause an odd view of the whole namespace. For example, consider moving "refs/heads/z/foo" to "refs/heads/a/foo", while somebody else reads simultaneously. Even with create-before-delete, we can get the sequence: 1. Reader reads "refs/heads/a/" and sees it does not contain "foo". 2. Writer writes "refs/heads/a/foo". 3. Writer deletes "refs/heads/z/foo". 4. Reader reads "refs/heads/z", which does not contain "foo". That race can be closed with a double-read of the ref namespaces, but that has poor performance. A more reasonable fix, IMHO, would be to have an alternate ref store that represents transactions atomically (keeping in mind that this really only matters for busy repos with simultaneous readers and writers, so it would not even need to be the default ref store). And once you have such a store, that solves the other problem, too: you can just treat the delete-create as a transaction anyway. It also solves a similar problem with refs that rewind. So even leaving it as-is does not make the problem go away, though the proposed change does exacerbate it somewhat. I wonder how hard it would be to do the safer thing in the common case that there is no D/F conflict. That is, do multiple passes at updating the refs: 1. Create/update any refs we can. Those with D/F conflicts are put aside for the moment. 2. Delete any refs according to the --prune rules. 3. Come back to any D/F conflicts and try them again. I dunno. As far as I know, this is not a race that people see often in real life (I do not have any confirmed cases of it yet). So it may simply not be worth worrying about. -Peff -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html