On Tue, Feb 09, 2021 at 01:14:17PM -0800, Junio C Hamano wrote:

> Jeff King <peff@xxxxxxxx> writes:
>
> > I don't know that it's really worth digging into that much, though it's
> > quite possible there may be some easy wins by optimizing those memcpy
> > calls. E.g., I'm not sure if the compiler ends up inlining them or not.
> > If it doesn't realize that the_hash_algo->rawsz is only ever "20" or
> > "32", we could perhaps help it along with specialized versions of
> > hashcpy(). If somebody does want to play with it, this patch may make a
> > good testbed. :)
>
> Yuck. That reminds me of the adventure Shawn made in the Java land,
> benchmarking which one among int[5], int a,b,c,d,e, and char[40] is
> the most efficient way (both storage-wise and performance-wise) to
> store a SHA-1 hash. I wish we didn't have to go there.
>
> It indeed is an interesting, though a bit sad, observation that
> even with good precomputed information, an overly heavy interface
> can kill the potential performance benefit.

Agreed. But I'm hoping we can continue to mostly ignore it. I suspect
this finding means we are wasting a few hundred milliseconds copying
oids around during a clone of torvalds/linux. But overall that is a
pretty heavy-weight operation, and I doubt anybody really notices. And
for something as lightweight as --disk-usage, it was easy enough to
optimize around it.

It probably does have a more measurable impact in something like:

  git rev-list --use-bitmap-index --objects HEAD >/dev/null

where we really do need those oids, and the extra copying might add up.
I guess if somebody is interested in micro-optimizing, that is probably
a good command to look at.

-Peff