On Tue, Aug 31, 2021 at 12:25 PM Jeff King <peff@xxxxxxxx> wrote:

> I do think it would be nice to take the packet_writer
> interface further (letting it replace the static buf, and use stdio
> handles, and using it throughout upload-pack).

I would like that too, for the sake of neatness and general
performance, but I don't have the time to take on a larger project
like that at the moment.

> Does the 64k buffer actually improve things? Here are the timings I get
> on a repo with ~1M refs (it's linux.git with one ref per commit).

Thanks for challenging that. I have a repeatable benchmark where it
matters, because each write syscall wakes up a chain of proxies
between the user and git-upload-pack. Larger buffers mean fewer
wake-ups.

But then I tried to simplify my example by having sshd as the only
intermediary, and in that experiment 64K buffers were no better than
4K buffers. I think that goes to show that picking a good buffer size
is hard, and we'd be better off picking one specifically for Gitaly
(and GitLab) that works with our stack.

> Summary
>   'GIT_REF_PARANOIA=1 git.compile upload-pack .' ran
>     2.17 ± 0.02 times faster than 'git.compile upload-pack .'
>
> It's not exactly the intended use of that environment variable, but its
> side effect is that we do not call has_object_file() on each ref tip.

That is nice to know, but as a user of Git I don't know when it is or
is not safe to skip those has_object_file() calls. If it is always
safe to skip them, then Git should always skip them. If not, I will
err on the side of caution and keep the checks.

Jacob