Jacob Vosmaer <jacob@xxxxxxxxxx> writes: > In both protocol v0 and v2, upload-pack writes one pktline packet per > advertised ref to stdout. That means one or two write(2) syscalls per > ref. This is problematic if these writes become network sends with > high overhead. > > This commit changes both send_ref callbacks to use buffered IO using > stdio. The default buffer size is usually 4KB, but with musl libc it > is only 1KB. So for consistent results we need to set a buffer size > ourselves. During startup of upload-pack, we set the stdout buffer > size to LARGE_PACKET_MAX, i.e. just shy of 64KB. > > To give an example of the impact: I set up a single-threaded loop that > calls ls-remote (with HTTP and protocol v2) on a local GitLab > instance, on a repository with 11K refs. When I switch from Git > v2.32.0 to this patch, I see a 50% reduction in CPU time for Git, and > 75% for Gitaly (GitLab's Git RPC service). > > So using buffered IO not only saves syscalls in upload-pack, it also > saves time in things that consume upload-pack's output. > > Signed-off-by: Jacob Vosmaer <jacob@xxxxxxxxxx> > --- Nicely explained. > + /* > + * Increase the stdio buffer size for stdout, for the benefit of ref > + * advertisement writes. We are only allowed to call setvbuf(3) "after > + * opening a stream and before any other operations have been performed > + * on it", so let's call it before we have written anything to stdout. > + */ > + if (setvbuf(stdout, xmalloc(LARGE_PACKET_MAX), _IOFBF, > + LARGE_PACKET_MAX)) > + die_errno("failed to grow stdout buffer"); Nice to see a comment on the tricky part. I do not think we mind if we rounded up the allocation size to the next power of two here, but there probably won't be any measurable benefit for doing so. > diff --git a/ls-refs.c b/ls-refs.c > index 88f6c3f60d..83f2948fc3 100644 > --- a/ls-refs.c > +++ b/ls-refs.c > @@ -105,7 +105,7 @@ static int send_ref(const char *refname, const struct object_id *oid, > } > > strbuf_addch(&refline, '\n'); > - packet_write(1, refline.buf, refline.len); > + packet_fwrite(stdout, refline.buf, refline.len); > > strbuf_release(&refline); > return 0; > @@ -171,6 +171,9 @@ int ls_refs(struct repository *r, struct strvec *keys, > strvec_push(&data.prefixes, ""); > for_each_fullref_in_prefixes(get_git_namespace(), data.prefixes.v, > send_ref, &data, 0); > + /* Call fflush because send_ref uses stdio. */ > + if (fflush(stdout)) > + die_errno(_("write failure on standard output")); There is maybe_flush_or_die() helper that does a bit too much (I do not think this code path needs to worry about GIT_FLUSH) but does call check_pipe(errno) like packet_write_fmt() does upon seeing a write failure. > packet_flush(1); I briefly wondered if all the operations on refline can be turned into direct operations on stdout, but the answer obviously is no. We still need to have a strbuf to assemble a single packet and measure its final length before we can send it out to stdout. > diff --git a/upload-pack.c b/upload-pack.c > index 297b76fcb4..b592ac6cfb 100644 > --- a/upload-pack.c > +++ b/upload-pack.c > @@ -58,6 +58,7 @@ enum allow_uor { > */ > struct upload_pack_data { > struct string_list symref; /* v0 only */ > + struct strbuf send_ref_buf; /* v0 only */ > struct object_array want_obj; > struct object_array have_obj; > struct oid_array haves; /* v2 only */ > @@ -126,6 +127,7 @@ static void upload_pack_data_init(struct upload_pack_data *data) > struct string_list uri_protocols = STRING_LIST_INIT_DUP; > struct object_array extra_edge_obj = OBJECT_ARRAY_INIT; > struct string_list allowed_filters = STRING_LIST_INIT_DUP; > + struct strbuf send_ref_buf = STRBUF_INIT; IIUC, the most notable difference between this round and the previous one is that now we are no longer buffering more than one packet worth of data because we let the stdio to accumulate them. I was a bit surprised that we still want to have a strbuf inside this structure (which is there only because it wants to persist during the whole conversation with the other side). Ahh, sorry, scratch that. I do remember the discussion/patch that it was hurting to make calls to strbuf-init/strbuf-release once per ref, and it is an easy way to have the structure here to reuse it. OK. This step looks quite sensible, other than the same "do we want check_pipe() before dying upon fflush() failure?" we see in the last hunk below. I didn't mention this in the review of 1/2, but the new fwrite_or_die() helper function may also have the same issue. Thanks. > @@ -1348,6 +1354,9 @@ void upload_pack(struct upload_pack_options *options) > reset_timeout(data.timeout); > head_ref_namespaced(send_ref, &data); > for_each_namespaced_ref(send_ref, &data); > + /* Call fflush because send_ref uses stdio. */ > + if (fflush(stdout)) > + die_errno(_("write failure on standard output")); > advertise_shallow_grafts(1); > packet_flush(1); > } else {