On Tue, Jun 27, 2023 at 11:22:02AM -0700, Junio C Hamano wrote: > - populate_value() -> get_object() -> oid_object_info_extended() -> > grab_common_values() asks for ATOM_OBJECTTYPE that should be a > single xstrdup(), but oid_object_info_extended() needs to parse > the object, but .info.contentp ought to be NULL and we should > not be calling parse_object_buffer(). This is the atom that I expect to give most of the cost. We should definitely be computing that just with a pack lookup (and chasing deltas through the pack, though commits usually aren't deltas). But without that atom, I think we would not even mmap the idx or pack file at all. The ref-filter code is also pretty keen to malloc() little strings. When we're measuring 5ms total runtimes and there's just not that much actual work to do, those little things start to matter more. > So it might be worth looking into where the time is going (didn't > Peff or somebody do that a few years ago, though?) when using the > default format and optimize or special case the codepath,[...] Running: perf record -g git for-each-ref --format='%(refname) %(objecttype)' >/dev/null perf script report flamegraph in linux.git shows that we spend a lot of time faulting in .idx pages. A lot of the time is attributed to ref_array_sort(), but I think that's a red herring. It lazy-loads the atom values, so that's where most of the work is happening (though I would have thought it would just do the refname atom, not all of them; but I guess it doesn't really matter since we'll eventually need them to print anyway). Commenting out the sort call just shows the same time spent in format_ref_array_item(). The patches you're thinking of are probably: https://lore.kernel.org/git/YTNpQ7Od1U%2F5i0R7@xxxxxxxxxxxxxxxxxxxxxxx/ which are mostly about micro-optimizing out those mallocs. They do still help a little in the '%(refname)' case, but not nearly as much as dropping '%(objecttype)' does. (Using --format=x in theory drops out some mallocs, but I wasn't able to measure any actual speedup). > [...]but the > responses I have seen from two of you makes me suspect that the new > option is not the best general approach. Yeah, my feeling is that with the reduced format, eliminating the pipe/wc overhead is a pretty small improvement. -Peff