On Mon, Aug 19, 2019 at 12:12:32PM -0400, Derrick Stolee wrote:
> On 8/19/2019 11:02 AM, SZEDER Gábor wrote:
> > On Mon, Aug 19, 2019 at 10:50:48AM -0400, Derrick Stolee wrote:
> >> Note that I don't include the "without patch" numbers. For some
> >> reason the path provided is particularly nasty and caused 20,000+
> >> missing blobs to be downloaded one-by-one (remember: VFS for Git
> >> has many partial-clone-like behaviors). I canceled my test after 55
> >> minutes. I'll dig in more to see what is going on, since only 37
> >> commits actually change that path.
> >
> > Don't bother digging into it, I know why it happens (and how to avoid
> > it! :), but don't have the time right now to explain.
>
> Great! I look forward to your explanation and fix later.
>
> Just to be sure we've got the same issue, here is a section of the
> call stack in the hot portion:
>
> line_log_filter
> + queue_diffs
>  + diffcore_std
>   + diffcore_rename
>    + diff_populate_filespec

Hmm, interesting. Certainly most of the time is wasted in
queue_diffs(), but in my perf measurements diff_tree_oid() is
responsible for most of that, not diffcore_std(). This might still be
the same issue, though, but perhaps VFS for Git shifts the balance.

We'll see, here are my patches to address that diff_tree_oid()
slowness:

  https://public-inbox.org/git/20190821110424.18184-1-szeder.dev@xxxxxxxxx/T/