Junio C Hamano <gitster@xxxxxxxxx> wrote in
"What's cooking in git.git (Feb 2009, #06; Wed, 18)"

> * jc/blame (Wed Jun 4 22:58:40 2008 -0700) 2 commits
>  + blame: show "previous" information in --porcelain/--incremental
>    format
>  + git-blame: refactor code to emit "porcelain format" output
>
> This gives Porcelains (like gitweb) the information on the commit
> _before_ the one that the final blame is laid on, which should save
> them one rev-parse to dig further.  The line number in the "previous"
> information may need refining, and sanity checking code for reference
> counting may need to be resurrected before this can move forward.
>
> I thought recent tig discussion may blow new life into it, but is
> this unneeded?  If so I'd rather revert it (or discard after 1.6.2).

The commit message for the second patch in this series says the
following:

    blame: show "previous" information in --porcelain/--incremental format

    When the final blame is laid for a line to a <commit, path> pair,
    it also gives a "previous" information to --porcelain and
    --incremental output format.  It gives the parent commit of the
    blamed commit, _and_ a path in that parent commit that corresponds
    to the blamed path --- in short, it is the origin that would have
    been blamed (or passed blame through) for the line _if_ the blamed
    commit did not change that line.

(The patch itself doesn't include an update to the documentation.)

I guess this means that the --porcelain and --incremental output formats
gain an additional header:

    "previous" <sha-1 of parent> <whitespace-quoted-filename>

I also guess that if it is a merge commit that got blamed (because it
was an evil merge; otherwise one of the parents or its ancestors would
get the blame), we would get two or more "previous" info lines, in the
order of the parents.

I assume that the filename in the "previous" info can differ from the
filename in the blamed commit only because of wholesale file rename
detection, and that it does not detect movement of code fragments by
itself... or does it?
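To make the guessed header shape concrete, here is a minimal Python
sketch of how a Porcelain might pick the "previous" lines out of one
header block.  It assumes the format guessed above (the word "previous",
a 40-hex sha-1, then the filename) and it does not unquote the filename;
parse_previous_headers is a hypothetical helper name, not something that
exists in git or gitweb.

```python
import re

# Assumed header shape, taken from the guess above (not from documentation):
#   previous <40-hex sha-1 of parent> <filename>
PREVIOUS_RE = re.compile(r"^previous ([0-9a-f]{40}) (.+)$")


def parse_previous_headers(header_lines):
    """Collect (sha1, filename) pairs from 'previous' header lines.

    Returns a list so that several 'previous' lines (the evil-merge
    guess above) would be kept in the order they were emitted.
    The filename is returned as-is; any quoting is left untouched.
    """
    result = []
    for line in header_lines:
        match = PREVIOUS_RE.match(line)
        if match:
            result.append((match.group(1), match.group(2)))
    return result
```

A caller would then use the first pair for the common single-parent
case and fall back to the current behaviour when the list is empty.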
This info would be even more helpful for gitweb than I thought, because
of the 'filename' part.  We can simply relax the refname restrictions
and use <blamed commit>^ or <blamed commit>^<n> for the 'hb' parameter,
but the filename gives some trouble (although it should happen rarely).
Well, one of the solutions I thought of had an intermediate step where
gitweb resolved <ref>^ to <sha1> and did an HTTP redirection; in that
solution there is a place where gitweb can find the previous filename
(the filename in <rev>^, given the filename in <rev>), but it would be
a mess.

Luben Tuikov in 244a70e6 (Blame "linenr" link jumps to previous state
at "orig_lineno") made gitweb link to the previous version of a file
(always using the first parent), for better data mining, or in other
words to be able to follow the history of a given line.

The current code makes a few assumptions:

 * we are always interested in the first parent; this matters only for
   'evil merges', if the merge commit itself was blamed, which should
   be a fairly rare case

 * the name of the file is the same in the parent as in the blamed
   commit; without the proposed "previous" header in blame output we
   would have to run git-diff-tree to check it, all for the rare case
   of a file rename, or complicate a bit resolving the filename after
   clicking the link

 * the previous version of a given line is at the same position in the
   file in the parent, or at least close to it

It is the last assumption that is, I think, the hardest to correct.
What algorithm do you propose to find the previous version of a line?
It is not a question with a definitive answer, I think, so some
heuristic would be required.  The previous version of a line might not
even exist!  (In that case we would probably want to land at the place
where it was inserted.)  Fortunately this is a situation where an
approximation is good enough.

(I don't know if git-blame has access to the textual diff, or at least
to the information in chunk headers, when calculating blame
information, so I don't know if the following algorithm is feasible.)
I propose the following algorithm:

 * find the hunk in the textual diff whose postimage contains the
   current version of the line; searching the hunk headers for the
   line number should be enough here

 * get the line numbers of the corresponding preimage (I'm not sure
   whether this step would fail if code movement detection is enabled)

 * either find the most similar line in the preimage, or calculate
   (perhaps with linear interpolation) where the given line number in
   the postimage line range falls in the preimage line range

What do you think about this algorithm?  Is it good enough?

P.S. I think that going to the blamed commit version might also be
interesting: you can check how the neighbourhood of the given line
changed, can't you?

-- 
Jakub Narebski
Poland
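The hunk-based mapping proposed above could be sketched roughly as
follows, in Python.  This is only an illustration under assumptions: it
reads standard unified-diff hunk headers ("@@ -a,b +c,d @@"), it uses
the linear-interpolation variant rather than the "most similar line"
variant, and parse_hunks / map_to_preimage are hypothetical names that
do not correspond to anything in git-blame or gitweb.

```python
import re

# Unified-diff hunk header: @@ -old_start[,old_len] +new_start[,new_len] @@
HUNK_RE = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunks(diff_text):
    """Return (old_start, old_len, new_start, new_len) for each hunk header."""
    hunks = []
    for line in diff_text.splitlines():
        m = HUNK_RE.match(line)
        if m:
            old_len = int(m.group(2)) if m.group(2) is not None else 1
            new_len = int(m.group(4)) if m.group(4) is not None else 1
            hunks.append((int(m.group(1)), old_len, int(m.group(3)), new_len))
    return hunks


def _span(start, length):
    """Half-open [first, end) line range; an empty range sits after 'start'."""
    if length > 0:
        return start, start + length
    return start + 1, start + 1


def map_to_preimage(hunks, new_lineno):
    """Approximate the preimage line number for a postimage line number."""
    offset = 0  # old-minus-new shift accumulated from hunks before the line
    for old_start, old_len, new_start, new_len in hunks:
        old_first, old_end = _span(old_start, old_len)
        new_first, new_end = _span(new_start, new_len)
        if new_lineno < new_first:
            break  # line lies before this hunk; earlier offset applies
        if new_lineno < new_end:
            if old_len == 0:
                # Pure insertion: no previous version, so point at the
                # place in the preimage where the lines were inserted.
                return max(old_start, 1)
            # Linear interpolation of the position within the hunk.
            return old_first + (new_lineno - new_first) * old_len // new_len
        offset = old_end - new_end
    return new_lineno + offset
```

Because hunks normally include context lines, the interpolated result
is only an approximation; a diff generated with zero lines of context
would give tighter ranges.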