On 8 May 2012 09:13, Kelly Dean <kellydeanch@xxxxxxxxx> wrote:
>
> --- On Mon, 5/7/12, PJ Weisberg <pj@xxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > But there could be any number of unrelated commits newer than "Bar"
> > but older than "Revert Bar" on other branches. Even if you could
> > trust the timestamps to be accurate (you can't), you still can't
> > determine a commit's parent unambiguously.
> Therefore, provenance does matter, and it must be explicitly recorded
> because it can't necessarily be correctly and fully deduced from content
> alone. And git does record inter-commit provenance.
> However, git doesn't record intra-commit provenance, as I mentioned in my
> original message. My question is: why this discrepancy? Either provenance
> matters, or it doesn't; why record it in one case but not the other?

I don't think it has been firmly decided that provenance is unimportant
in the intra-commit scope; rather, as you stated, such information is
simply not available to us.

My understanding is that git makes a best-effort guess at the flow of
content through the repository. If content is moved, by deleting it in
one place and adding it in another, that is easy to see in git. However,
if content is merely added, and the same content already occurs in
multiple places in the repository, there is no sane way of knowing where
it came from. Even if the added content occurred in only one other
place, you would need to check every single file for every single hunk
added in every single commit to determine just where it came from.

Why stop there, though? It's possible we are copying the content from
some other branch we don't have checked out at the moment, so every time
we commit, let's search the entire repository's history for an
occurrence of each hunk we are adding. That way lies madness.

With regard to file renames, all that has been shown so far is that
provenance matters for commit parents.
Nothing about the similarities between the commit-parent and rename
situations you mention leads me to conclude that because provenance is
important to one, it is important to the other. Indeed, one of the
arguments against provenance being important in the file-rename case is
that we can generally determine it from the existing information, which
is not true of the general commit-parent case.

There are additional arguments. For example, simply recording file name
changes doesn't capture many situations we would like to know about,
such as a single file being split into two files. Tracking the content
of those files, and hence being able to deduce where their content came
from, solves both this and the general rename situation. Trying to guess
which file was 'renamed' and which is 'new' when a file is actually
split into two new files would lead to misleading and incomplete
information in the end.

So just because provenance matters in some situations doesn't mean it
matters in all (at least in the way we have been applying 'matters').
Furthermore, there are additional reasons why the existing
content-tracking system is beneficial. Extra layers of rename encoding
or the 'heritage of data chunks' would be extra work with little added
benefit (though there are a few corner cases, from memory, where
automatic rename detection fails and so /some/ benefit would be seen).

Regards,

Andrew Ardill
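
To make the file-split argument concrete, here is a rough sketch of deducing origins from content alone. It is not git's actual algorithm: the function name guess_origins, the in-memory dicts standing in for trees, and the use of Python's difflib as the similarity measure are all assumptions made for illustration. The 0.5 threshold only loosely echoes git's default 50% rename similarity.

```python
from difflib import SequenceMatcher


def guess_origins(old_tree, new_tree, threshold=0.5):
    """Map each added path to the deleted path whose content it most resembles.

    old_tree and new_tree are hypothetical {path: text} dicts, not git objects.
    """
    deleted = {p: c for p, c in old_tree.items() if p not in new_tree}
    added = {p: c for p, c in new_tree.items() if p not in old_tree}

    origins = {}
    for new_path, new_content in added.items():
        best_path, best_score = None, 0.0
        for old_path, old_content in deleted.items():
            # Line-based similarity in [0, 1]; a stand-in for git's own metric.
            score = SequenceMatcher(
                None, old_content.splitlines(), new_content.splitlines()
            ).ratio()
            if score > best_score:
                best_path, best_score = old_path, score
        if best_path is not None and best_score >= threshold:
            origins[new_path] = best_path
    return origins


# util.c is split into two new files; README is untouched.
old = {"util.c": "a\nb\nc\nd\ne\nf\n", "README": "hello\n"}
new = {"parse.c": "a\nb\nc\n", "print.c": "d\ne\nf\n", "README": "hello\n"}
print(guess_origins(old, new))
```

Both new files are traced back to util.c, which a recorded one-to-one 'rename' could not express. Note that even this toy version compares every added file against every deleted one; extending it to every hunk in every file on every branch is exactly the cost described above.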