On Fri, Jun 28, 2019 at 08:35:28AM -0400, Derrick Stolee wrote:

> > +	while test "$total" -gt 0
> > +	do
> > +		echo "commit $ref" &&
> > +		printf 'author %s <%s> %s\n' \
> > +			"$GIT_AUTHOR_NAME" \
> > +			"$GIT_AUTHOR_EMAIL" \
> > +			"$cur_time -0700" &&
> > +		printf 'committer %s <%s> %s\n' \
> > +			"$GIT_COMMITTER_NAME" \
> > +			"$GIT_COMMITTER_EMAIL" \
> > +			"$cur_time -0700" &&
> > +		echo "data <<EOF" &&
> > +		eval "echo \"$message\"" &&
> > +		echo "EOF" &&
> > +		eval "echo \"M 644 inline $filename\"" &&
> > +		echo "data <<EOF" &&
> > +		eval "echo \"$contents\"" &&
> > +		echo "EOF" &&
> > +		echo &&
> > +		n=$((n + 1)) &&
> > +		cur_time=$((cur_time + 1)) &&
> > +		total=$((total - 1)) ||
> > +		echo "poison fast-import stream"
> > +	done
>
> I am not very good at the nitty-gritty details of our scripts, but
> looking at this I wonder if there is a cleaner and possibly faster
> way to do this loop. The top thing on my mind are the 'eval "echo X"'
> lines. If they start processes, then we can improve the performance.
> If not, then it may not be worth it.

No, evals by themselves don't require a process. That whole loop should
all happen as a single process (because it's the left-hand side of the
pipe, it does require a subshell).

We could drop even that process by writing into a temporary file. The
size probably wouldn't be a big deal, and I doubt the latency would even
matter much (and anyway, when you're running the tests in parallel
anyway, CPU time is the most important metric).

It might also make the code a little simpler, since we'd be running in
the main shell and could just use test_tick naturally (rather than the
manual addition hackery). I'll take a look.

I wasn't super concerned with eliminating processes here as long as the
number of them is constant with respect to the number of commits we're
generating. The big improvement is taking, say, 300 test_commit calls
and turning it into a single bulk call.
Replacing a single-commit test_commit with this would be break-even at
best.

> In wonder if instead we could create some format string outside the
> loop and then pass the values that change between iterations into
> that format string.

The evals should be fast. But they are potentially error-prone, since
callers have to pass something like --message='commit $n' with single
quotes to keep the "$" intact. But because all of our test snippets are
inside single-quotes already, you end up with:

  test_commit_bulk --message="commit \$n"

(though in practice most of the callers used the --id shorthand, which
neatly sidesteps this).

Since there's literally only one variable to interpolate, we could swap
this out for using printf formatters, and letting "%s" mean the same as
"$n". It should perform the same but is a bit less magical and a bit
harder to screw up. It would also be easier to handle if
test_commit_bulk eventually became C code.

The only downside I can think of is that you can't mention "%s" twice,
but I find it hard to imagine a caller would want that anyway.

So I'll also take a look at that.

-Peff
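
For illustration, here is a minimal sketch of the two interpolation
styles discussed above. The helper names render_eval and render_printf
are hypothetical and are not part of test_commit_bulk; the sketch only
shows how an eval-based template and a printf-based template differ from
the caller's point of view.

```shell
# Hypothetical helpers for illustration only; not the real
# test_commit_bulk implementation.

# eval-based: the template is expanded as shell code, so the caller
# must keep "$n" unexpanded until this function runs.
render_eval () {
	template=$1
	n=$2
	eval "echo \"$template\""
}

# printf-based: the template is a printf format string in which a
# single "%s" stands for the commit number.
render_printf () {
	template=$1
	n=$2
	printf "$template\n" "$n"
}

render_eval 'commit $n' 3
render_printf 'commit %s' 3
```

Both calls print "commit 3". With the printf form there is no "$" to
protect from the enclosing quotes, but a second "%s" in the template
would expand to an empty string, since only one argument is supplied.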