On 2019-01-22 at 10:14 Junio C Hamano <gitster@xxxxxxxxx> wrote: > René Scharfe <l.s.r@xxxxxx> writes: > > > Am 17.01.2019 um 21:29 schrieb Barret Rhoden: > >> The blame_entry will get passed up the tree until we find a commit that > >> has a diff chunk that affects those lines. If an ignored commit added > >> more lines than it removed, the blame will fall on a commit that made a > >> change nearby. There is no general solution here, just a best-effort > >> approach. For a trivial example, consider ignoring this commit: > >> > >> Z: "Adding Lines" > >> foo > >> +No commit > >> +ever touched > >> +these lines > >> bar > > > > Wouldn't it make more sense to assign such lines to unknown, perhaps > > represented by an all-zero commit ID, instead of blaming a semi-random > > bystander? > > I share the sentiment. > > Isn't it, however, showing a bigger problem in the "feature"? > > Your "a change that adds lines without removing any" is an obvious > case where these added lines have no corresponding lines in the > preimage, but most of the time it is unclear what line corresponds > to what previous line. If a commit being "ignored" brought a change > like this: > > 1 > -four > -three > +3 > +4 > 5 > > did "+3" come from "-three"? > > Or did "+4" (read: "added '4'") come from "-three" (read: "removed > 'three'")? Did it come from "-four"? Or was it genuinely added by > that ignored commit? Your suggestion deals with the case where we > decide that "+4" had no corresponding lines in the preimage (and > paint it as "no blame can be assigned"). But when we decide that > "+4" came from "-four" or "-three", we continue drilling down from > that ignored commit and assign the blame to either the commit that > introduced "four" or the commit that introduced "three", which would > obviously give us different result. Worse yet, if a reader expected > us to consider "+4" came from "-four" at that ignored commit, but > the algorithm decided that "+4" corresponded to "-three", when we > show the commit that eventually gets blamed for that line that has > "4" has no "four" (it has "three"), which I suspect would confuse > the reader of the output. > > So... I dunno. I guess if you swap the lines as well as change them, then we're not going to be able to detect that. Just to be clear, if you did this: Commit A: --------- other_stuff +one other_stuff Commit B: --------- other_stuff one +two other_stuff Commit C: --------- other_stuff -one -two +1 +2 other_stuff And ignore commit C, my change will properly identify commit A and B, e.g.: OTHER_HASH Author Date 11) other_stuff *A_HASH Author Date 12) 1 *B_HASH Author Date 13) 2 OTHER_HASH Author Date 14) other_stuff But if you swapped the lines in addition to change names to numbers: Commit C-swap: -------------- other_stuff -one -two +2 +1 other_stuff Then it won't have the semantic knowledge that "one" == "1". If a user is ignoring a commit, we don't have an oracle that knows exactly what that commit did to determine what commit the user wants blamed. The current change attempts to find the last commit that touched a line. Barret