Re: [PATCH 1/1] upload-pack.c: make output buffer size configurable

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Mon, Dec 13, 2021 at 02:23:45PM +0100, Jacob Vosmaer wrote:

> The current size of the buffer is 8192+1. The '+1' is a technical
> detail orthogonal to this change. On GitLab.com we use GitLab's
> pack-objects cache which does writes of 65515 bytes. Because of the
> default 8KB buffer size, propagating these cache writes requires 8
> pipe reads and 8 pipe writes from git-upload-pack, and 8 pipe reads
> from Gitaly (our Git RPC service). If we increase the size of the
> buffer to the maximum Git packet size, we need only 1 pipe read and 1
> pipe write in git-upload-pack, and 1 pipe read in Gitaly to transfer
> the same amount of data. In benchmarks with a pure fetch and 100%
> cache hit rate workload we are seeing CPU utilization reductions of
> over 30%.

I was curious to reproduce this locally, so I came up with:

  (
    printf "003fwant %s side-band-64k" $(git rev-parse HEAD)
    printf 0000
    echo '0009done'
  ) |
  git -c uploadpack.packobjectsHook='cat objects/pack/pack-*.pack; :' upload-pack . |
  pv >/dev/null

which hackily simulates the server side of your "cached" case. :) I ran
it on a fully-packed clone of linux.git.

It gets about 2.3GB/s with the tip of 'master' and 3.2GB/s with the
equivalent of your patch (using LARGE_PACKET_DATA_MAX). So definitely an
improvement.

Without the cached case (so actually running pack-objects, albeit a
pretty quick one because of bitmaps and pack-reuse), the timings are
about the same (171MB/s versus 174MB/s, but really it's just pegging a
CPU running pack-objects). So it would be fine to just do this
unconditionally, I think.

Looking at strace, the other thing I notice is that we write() the
packet header separately in send_sideband(), which doubles the number of
syscalls. I hackily re-wrote this to use writev() instead (patch below),
but it doesn't seem to actually help much (maybe a curiosity to explore
further, but definitely not something to hold up your patch).

-Peff

---
diff --git a/sideband.c b/sideband.c
index 85bddfdcd4..d0945507a3 100644
--- a/sideband.c
+++ b/sideband.c
@@ -5,6 +5,11 @@
 #include "help.h"
 #include "pkt-line.h"
 
+/* hack; should go in git-compat-util, and should provide compat
+ * wrapper around regular write()
+ */
+#include <sys/uio.h>
+
 struct keyword_entry {
 	/*
 	 * We use keyword as config key so it should be a single alphanumeric word.
@@ -257,6 +262,7 @@ void send_sideband(int fd, int band, const char *data, ssize_t sz, int packet_ma
 
 	while (sz) {
 		unsigned n;
+		struct iovec iov[2];
 		char hdr[5];
 
 		n = sz;
@@ -265,12 +271,17 @@ void send_sideband(int fd, int band, const char *data, ssize_t sz, int packet_ma
 		if (0 <= band) {
 			xsnprintf(hdr, sizeof(hdr), "%04x", n + 5);
 			hdr[4] = band;
-			write_or_die(fd, hdr, 5);
+			iov[0].iov_base = hdr;
+			iov[0].iov_len = 5;
 		} else {
 			xsnprintf(hdr, sizeof(hdr), "%04x", n + 4);
-			write_or_die(fd, hdr, 4);
+			iov[0].iov_base = hdr;
+			iov[0].iov_len = 4;
 		}
-		write_or_die(fd, p, n);
+		iov[1].iov_base = (void *)p;
+		iov[1].iov_len = n;
+		/* should check for errors, but also short writes and EINTR, etc */
+		writev(fd, iov, 2);
 		p += n;
 		sz -= n;
 	}
diff --git a/upload-pack.c b/upload-pack.c
index c78d55bc67..111de8c60c 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -194,7 +194,7 @@ static int write_one_shallow(const struct commit_graft *graft, void *cb_data)
 }
 
 struct output_state {
-	char buffer[8193];
+	char buffer[LARGE_PACKET_DATA_MAX];
 	int used;
 	unsigned packfile_uris_started : 1;
 	unsigned packfile_started : 1;



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux