David Kastrup <dak@xxxxxxx> writes: > Junio C Hamano <gitster@xxxxxxxxx> writes: > >> David Kastrup <dak@xxxxxxx> writes: >> >>> The previous implementation uses a sorted linear list of struct >>> blame_entry in a struct scoreboard for organizing all partial or >>> completed work. Every task that is done requires going through the >>> whole list where most entries are not relevant to the task at hand. >>> >>> This commit reorganizes the data structures in order to have each >>> remaining subtask work with its own sorted linear list it can work off >>> front to back. Subtasks are organized into "struct origin" chains >>> hanging off particular commits. Commits are organized into a priority >>> queue, processing them in commit date order in order to keep most of >>> the work affecting a particular blob collated even in the presence of >>> an extensive merge history. In that manner, linear searches can be >>> basically avoided anywhere. They still are done for identifying one >>> of multiple analyzed files in a given commit, but the degenerate case >>> of a single large file being assembled from a multitude of smaller >>> files in the past is not likely to occur often enough to warrant >>> special treatment. >>> --- >> >> Sign-off? > > Not while this is not fit for merging because of #if 0 etc and missing > functionality. This is just for review. That is not what Signing off a patch means, though ;-) >> Actually, I'd like to take my previous suggestion to add this as >> blame2 (to replace blame in the future) back. That was based on my >> fear/hope to see an implementation based on a completely different >> approach, but the basic premise of having one blame_entry record per >> the lines of the final image in the scoreboard, using diff between >> parents to the child to find common lines for passing the blame up, >> etc. have not changed at all and the change is all about organizing >> the pieces of data in a *much* *better* way to avoid needlessly >> finding what we already have computed. > > Yes. Basically, the call graph outside of blame.c itself should be > pretty much the same. > ... > Ok. I'm also certain to have a few "space between function name and > arguments" left and will grep for those before submitting the next > version. > > Next version will at least include -M option, possibly leaving -C for > later. OK. As we do not want to break the build in the middle of the series, but we still want to keep the individual steps reviewable incrementally, I would think the best structure would be have the series that consists of multiple steps "the basic infrastructure is there, but no rename following, and neither -M or -C works", "now renames are followed", "now -M works", etc., with tests that are not yet working (in the beginning, most of "git blame" test may no longer work until the series is finished) marked with s/test_expect_success/test_expect_failure/ and turn expect_failure into expect_success as the series advances. That way, we may get a better picture of what each step achieves. I dunno. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html