Am 05.04.2015 um 03:06 schrieb Jeff King: > As I've mentioned before, I have some repositories with rather large > numbers of refs. The worst one has ~13 million refs, for a 1.6GB > packed-refs file. So I was saddened by this: > > $ time git.v2.0.0 rev-parse refs/heads/foo >/dev/null 2>&1 > real 0m6.840s > user 0m6.404s > sys 0m0.440s > > $ time git.v2.4.0-rc1 rev-parse refs/heads/foo >/dev/null 2>&1 > real 0m19.432s > user 0m18.996s > sys 0m0.456s > > The command isn't important; what I'm really measuring is loading the > packed-refs file. And yes, of course this repository is absolutely > ridiculous. But the slowdowns here are linear with the number of refs. > So _every_ git command got a little bit slower, even in less crazy > repositories. We just didn't notice it as much. > > Here are the numbers after this series: > > real 0m8.539s > user 0m8.052s > sys 0m0.496s > > Much better, but I'm frustrated that they are still 20% slower than the > original. > > The main culprits seem to be d0f810f (which introduced some extra > expensive code for each ref) and my 10c497a, which switched from fgets() > to strbuf_getwholeline. It turns out that strbuf_getwholeline is really > slow. 10c497a changed read_packed_refs(), which reads *all* packed refs. Each is checked for validity. That sounds expensive if the goal is just to look up a single (non-existing) ref. Would it help to defer any checks until a ref is actually accessed? Can a binary search be used instead of reading the whole file? I wonder if pluggable reference backends could help here. Storing refs in a database table indexed by refname should simplify things. Short-term, can we avoid the getc()/strbuf_grow() dance e.g. by mapping the packed refs file? What numbers do you get with the following patch? --- refs.c | 36 ++++++++++++++++++++++++++++-------- 1 file changed, 28 insertions(+), 8 deletions(-) diff --git a/refs.c b/refs.c index 47e4e53..144255f 100644 --- a/refs.c +++ b/refs.c @@ -1153,16 +1153,35 @@ static const char *parse_ref_line(struct strbuf *line, unsigned char *sha1) * compatibility with older clients, but we do not require it * (i.e., "peeled" is a no-op if "fully-peeled" is set). */ -static void read_packed_refs(FILE *f, struct ref_dir *dir) +static void read_packed_refs(int fd, struct ref_dir *dir) { struct ref_entry *last = NULL; struct strbuf line = STRBUF_INIT; enum { PEELED_NONE, PEELED_TAGS, PEELED_FULLY } peeled = PEELED_NONE; + struct stat st; + void *map; + size_t mapsz, len; + const char *p; + + fstat(fd, &st); + mapsz = xsize_t(st.st_size); + if (!mapsz) + return; + map = xmmap(NULL, mapsz, PROT_READ, MAP_PRIVATE, fd, 0); - while (strbuf_getwholeline(&line, f, '\n') != EOF) { + for (p = map, len = mapsz; len; ) { unsigned char sha1[20]; const char *refname; const char *traits; + const char *nl; + size_t linelen; + + nl = memchr(p, '\n', len); + linelen = nl ? nl - p + 1 : len; + strbuf_reset(&line); + strbuf_add(&line, p, linelen); + p += linelen; + len -= linelen; if (skip_prefix(line.buf, "# pack-refs with:", &traits)) { if (strstr(traits, " fully-peeled ")) @@ -1204,6 +1223,7 @@ static void read_packed_refs(FILE *f, struct ref_dir *dir) } strbuf_release(&line); + munmap(map, mapsz); } /* @@ -1224,16 +1244,16 @@ static struct packed_ref_cache *get_packed_ref_cache(struct ref_cache *refs) clear_packed_ref_cache(refs); if (!refs->packed) { - FILE *f; + int fd; refs->packed = xcalloc(1, sizeof(*refs->packed)); acquire_packed_ref_cache(refs->packed); refs->packed->root = create_dir_entry(refs, "", 0, 0); - f = fopen(packed_refs_file, "r"); - if (f) { - stat_validity_update(&refs->packed->validity, fileno(f)); - read_packed_refs(f, get_ref_dir(refs->packed->root)); - fclose(f); + fd = open(packed_refs_file, O_RDONLY); + if (fd >= 0) { + stat_validity_update(&refs->packed->validity, fd); + read_packed_refs(fd, get_ref_dir(refs->packed->root)); + close(fd); } } return refs->packed; -- 2.3.5 -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html