On Wed, Oct 10 2018, Ævar Arnfjörð Bjarmason wrote: > On Wed, Oct 10 2018, SZEDER Gábor wrote: > >> On Wed, Oct 10, 2018 at 11:56:45PM +0200, Ævar Arnfjörð Bjarmason wrote: >>> On Wed, Oct 10 2018, SZEDER Gábor wrote: >> >>> >> for (i = 0; i < oids->nr; i++) { >>> >> + display_progress(progress, ++j); >>> >>> [...] >>> >>> > This display_progress() call, however, doesn't seem to be necessary. >>> > First, it counts all commits for a second time, resulting in the ~2x >>> > difference compared to the actual number of commits, and then causing >>> > my confusion. Second, all what this loop is doing is setting a flag >>> > in commits that were already looked up and parsed in the above loops. >>> > IOW this loop is very fast, and the progress indicator jumps from >>> > ~780k right to 1.5M, even on my tiny laptop, so it doesn't need a >>> > progress indicator at all. >>> >>> You're right, I tried this patch on top: >> >> [...] >> >>> And on a large repo with around 3 million commits the 3rd progress bar >>> doesn't kick in. >>> >>> But if I apply this on top: >>> >> [...] >>> >>> I.e. start the timer after 1/4 of a second instead of 1 second, I get >>> that progress bar. >>> >>> So I'm inclined to keep it. It just needs to be 4x the size before it's >>> noticeably hanging for 1 second. >> >> Just to clarify: are you worried about a 1 second hang in an approx. 12 >> million commit repository? If so, then I'm unconvinced, that's not >> even a blip on the radar, and the misleading numbers are far worse. > > It's not a blip on the runtime, but the point of these progress bars in > general is so we don't have a UI where there's no UI differnce between > git hanging and just doing work in some tight loop in the background, > and even 1 second when you're watching something is noticeable if it > stalls. > > Also it's 1 second on a server where I had 128G of RAM. I think even a > "trivial" flag change like this would very much change if e.g. the > system was under memory pressure or was swapping. > > And as noted code like this tends to change over time, that loop might > get more expensive, so let's future proof by having all the loops over N > call the progress code. > > When I wrote this the intent was just "report progress". So that it's > counting anything is just an implementation detail of how progress.c > works now. > > This was the reference to Duy's patch, i.e. instead of spewing numbers > at the user here let's just render a spinner. Then we no longer need to > make judgement calls about which loop over N is expensive right now, and > which one isn't, and if any of them will result in reporting a 2N number > while the user might be more familiar with or expecting N. > >>> That repo isn't all that big compared to what we've heard about out >>> there, and inner loops like this have a tendency to accumulate some more >>> code over time without a re-visit of why we weren't monitoring progress >>> there. >>> >>> But maybe we can fix the message. We say "Annotating commits in commit >>> grap", not "Counting" or whatever. I was trying to find something that >>> didn't imply that we were doing this once. One can annotate a thing more >>> than once, but maybe ther's a better way to explain this... >> >> IMO just remove it. Hrm, actually reading this again your initial post says we end up with a 2x difference v.s. the number of commits, but it's actually 3x. The loop that has a rather trivial runtime comparatively is the 3x, but the 2x loop takes a notable amount of time. So e.g. on git.git: $ git rev-list --all | wc -l; ~/g/git/git commit-graph write 166678 Annotating commits in commit graph: 518463, done. Computing commit graph generation numbers: 100% (172685/172685), done.