[PATCH 6/9] strbuf_getwholeline: avoid calling strbuf_grow

As with the recent speedup to strbuf_addch, we can avoid
calling strbuf_grow() in a tight loop of single-character
adds by instead checking strbuf_avail.

Note that we could instead call strbuf_addch directly here,
but it does more work than necessary: it will NUL-terminate
the result for each character read. Instead, in this loop we
read the characters one by one and then add the terminator
manually at the end.

Running "git rev-parse refs/heads/does-not-exist" on a repo
with an extremely large (1.6GB) packed-refs file went from
(best-of-5):

  real    0m10.948s
  user    0m10.548s
  sys     0m0.412s

to:

  real    0m8.601s
  user    0m8.084s
  sys     0m0.524s

for a wall-clock speedup of 21%.

Helped-by: Eric Sunshine <sunshine@xxxxxxxxxxxxxx>
Signed-off-by: Jeff King <peff@xxxxxxxx>
---
Our "don't write a NUL for each character" optimization is only possible
because we're intimate with the strbuf details here. I thought about
making a strbuf_addch_unsafe interface to let other callers do this,
too. But the only other caller that would use it is the config reader,
and I measured only a 3% speedup there, which I don't think is worth the
extra API complexity.

Whereas here it does make a big difference. Switching to strbuf_addch
knocks us back up into the 9.5s range. I think the difference is that
our lines are much longer than the tokens we're parsing in the config
file. So the percentage of wasted NUL writes is much higher here.
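
For anyone who wants to play with the idea outside of git, here is a
minimal standalone sketch of the two tricks (check the available space
before growing, and write the NUL once at the end). It is not the real
strbuf code: "struct minibuf" and the minibuf_* helpers are hypothetical
stand-ins, the doubling growth policy is an assumption, and it uses plain
getc() rather than flockfile/getc_unlocked to stay portable.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for git's struct strbuf; not the real API. */
struct minibuf {
	size_t alloc;	/* bytes allocated for buf */
	size_t len;	/* bytes used, excluding the trailing NUL */
	char *buf;
};

/* Free space, always keeping one byte in reserve for the NUL. */
static size_t minibuf_avail(const struct minibuf *mb)
{
	return mb->alloc ? mb->alloc - mb->len - 1 : 0;
}

/* Ensure room for `extra` more bytes plus the NUL (assumed doubling policy). */
static void minibuf_grow(struct minibuf *mb, size_t extra)
{
	size_t need = mb->len + extra + 1;
	size_t new_alloc;
	char *p;

	if (need <= mb->alloc)
		return;
	new_alloc = mb->alloc ? mb->alloc * 2 : 16;
	if (new_alloc < need)
		new_alloc = need;
	p = realloc(mb->buf, new_alloc);
	if (!p) {
		perror("realloc");
		exit(1);
	}
	mb->buf = p;
	mb->alloc = new_alloc;
}

/*
 * Read up to and including `term`, or until EOF.
 * Returns 0 on success, EOF if nothing could be read.
 */
static int minibuf_getline(struct minibuf *mb, FILE *fp, int term)
{
	int ch;

	mb->len = 0;
	while ((ch = getc(fp)) != EOF) {
		/* Cheap check in the hot loop; grow only when full. */
		if (!minibuf_avail(mb))
			minibuf_grow(mb, 1);
		/* No NUL written per character. */
		mb->buf[mb->len++] = ch;
		if (ch == term)
			break;
	}
	if (ch == EOF && mb->len == 0)
		return EOF;
	/* One terminator for the whole line; the reserved byte makes this safe. */
	mb->buf[mb->len] = '\0';
	return 0;
}
```

A caller would start from a zeroed struct (alloc = 0, len = 0, buf = NULL),
call minibuf_getline() in a loop until it returns EOF, and free() the
buffer afterwards. Because minibuf_avail() always holds one byte back, the
final NUL write never needs its own grow call.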

 strbuf.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/strbuf.c b/strbuf.c
index af2bad4..921619e 100644
--- a/strbuf.c
+++ b/strbuf.c
@@ -445,7 +445,8 @@ int strbuf_getwholeline(struct strbuf *sb, FILE *fp, int term)
 	strbuf_reset(sb);
 	flockfile(fp);
 	while ((ch = getc_unlocked(fp)) != EOF) {
-		strbuf_grow(sb, 1);
+		if (!strbuf_avail(sb))
+			strbuf_grow(sb, 1);
 		sb->buf[sb->len++] = ch;
 		if (ch == term)
 			break;
-- 
2.4.0.rc2.384.g7297a4a
