On 12/29/2019 1:24 AM, Jeff King wrote: > On Fri, Dec 27, 2019 at 11:11:37AM -0500, Derrick Stolee wrote: > >>> So here are a few patches to reduce the CPU and memory usage. They could >>> be squashed in at the appropriate spots, or perhaps taken as inspiration >>> if there are better solutions (especially for the first one). >> >> I tested these patches with the Linux kernel repo and reported my results >> on each patch. However, I wanted to also test on a larger internal repo >> (the AzureDevOps repo), which has ~500 commits with more than 512 changes, >> and generally has larger diffs than the Linux kernel repo. >> >> | Version | Time | Memory | >> |----------|--------|--------| >> | Garima | 16m36s | 27.0gb | >> | Peff 1 | 6m32s | 28.0gb | >> | Peff 2 | 6m48s | 5.6gb | >> | Peff 3 | 6m14s | 4.5gb | >> | Shortcut | 3m47s | 4.5gb | >> >> For reference, I found the time and memory information using >> "/usr/bin/time --verbose" in a bash script. > > Thanks for giving it more exercise. My heap profiling was done with > massif, which measures the heap directly. Measuring RSS would cover > that, but will also include the mmap'd packfiles. That's probably why > your linux.git numbers were slightly higher than mine. That's interesting. I initially avoided massif because it is so much slower than /usr/bin/time. However, the inflated numbers could be explained by that. Also, the distinction between mem_heap and mem_heap_extra may have interesting implications. Looking online, it seems that large mem_heap_extra implies the heap is fragmented from many small allocations. Here are my findings on the Linux repo: | Version | mem_heap | mem_heap_extra | |----------|----------|----------------| | Peff 1 | 6,500mb | 913mb | | Peff 2 | 3,100mb | 286mb | | Peff 3 | 781mb | 235mb | These numbers more closely match your numbers (in sum of the two columns). > (massif is a really great tool if you haven't used it, as it also shows > which allocations were using the memory. But it's part of valgrind, so > it definitely doesn't run on native Windows. It might work under WSL, > though. I'm sure there are also other heap profilers on Windows). I am using my Linux machine for my tests. Garima is using her Windows machine. >> By "Shortcut" in the table above, I mean the following patch on top of >> Garima's and Peff's changes. It inserts a max_changes option into struct >> diff_options to halt the diff early. This seemed like an easier change >> than creating a new tree diff algorithm wholesale. > > Yeah, I'm not opposed to a diff feature like this. > > But be careful, because... > >> diff --git a/diff.h b/diff.h >> index 6febe7e365..9443dc1b00 100644 >> --- a/diff.h >> +++ b/diff.h >> @@ -285,6 +285,11 @@ struct diff_options { >> /* Number of hexdigits to abbreviate raw format output to. */ >> int abbrev; >> >> + /* If non-zero, then stop computing after this many changes. */ >> + int max_changes; >> + /* For internal use only. */ >> + int num_changes; > > This is holding internal state in diff_options, but the same > diff_options is often used for multiple diffs (e.g., "git log --raw" > would use the same rev_info.diffopt over and over again). > > So it would need to be cleared between diffs. There's a similar feature > in the "has_changes" flag, though it looks like it is cleared manually > by callers. Yuck. You're right about this. What if we initialize it in diff_tree_paths() before it calls the recursive ll_difF_tree_paths()? > This isn't a problem for commit-graph right now, but: > > - it actually could be using a single diff_options, which would be > slightly simpler (it doesn't seem to save much CPU, though, because > the initialization is relatively cheap) > > - it's a bit of a subtle bug to leave hanging around for the next > person who tries to use the feature > > I actually wonder if this could be rolled into the has_changes and > diff_can_quit_early() feature. This really just a generalization of that > feature (which is like setting max_changes to "1"). I thought about this at first, but it only takes a struct diff_options right now. It does have an internally-mutated member (flags.has_changes) but it also seems a bit wrong to add a uint32_t of the count in this. Changing the prototype could be messy, too. There are also multiple callers, and limiting everything to tree-diff.c limits the impact. Thanks, -Stolee