Derrick Stolee <stolee@xxxxxxxxx> writes:

> +char* get_commit_graph_filename_hash(const char *pack_dir,

Asterisk sticks to the identifier, not type, in our codebase.

> +				     struct object_id *hash)
> +{
> +	size_t len;
> +	struct strbuf path = STRBUF_INIT;
> +	strbuf_addstr(&path, pack_dir);
> +	strbuf_addstr(&path, "/graph-");
> +	strbuf_addstr(&path, oid_to_hex(hash));
> +	strbuf_addstr(&path, ".graph");

Use of strbuf_addf() would make it easier to read and maintain, no?

> +
> +	return strbuf_detach(&path, &len);
> +}
> +
> +static void write_graph_chunk_fanout(struct sha1file *f,
> +				     struct commit **commits,
> +				     int nr_commits)
> +{
> +	uint32_t i, count = 0;
> +	struct commit **list = commits;
> +	struct commit **last = commits + nr_commits;
> +
> +	/*
> +	 * Write the first-level table (the list is sorted,
> +	 * but we use a 256-entry lookup to be able to avoid
> +	 * having to do eight extra binary search iterations).
> +	 */
> +	for (i = 0; i < 256; i++) {
> +		while (list < last) {
> +			if ((*list)->object.oid.hash[0] != i)
> +				break;
> +			count++;
> +			list++;
> +		}

If count and list are always incremented in unison, perhaps you do
not need an extra variable "last".  If typeof(nr_commits) is wider
than typeof(count), this loop and the next write-be32 are screwed
anyway ;-)

This comment probably applies equally to some other uses of the
same "compute last pointer to compare with running pointer for
termination" pattern in this patch.

> +		sha1write_be32(f, count);
> +	}
> +}

> +static int if_packed_commit_add_to_list(const struct object_id *oid,

That is a strange name.  "collect packed commits", perhaps?
> +					struct packed_git *pack,
> +					uint32_t pos,
> +					void *data)
> +{
> +	struct packed_oid_list *list = (struct packed_oid_list*)data;
> +	enum object_type type;
> +	unsigned long size;
> +	void *inner_data;
> +	off_t offset = nth_packed_object_offset(pack, pos);
> +	inner_data = unpack_entry(pack, offset, &type, &size);
> +
> +	if (inner_data)
> +		free(inner_data);
> +
> +	if (type != OBJ_COMMIT)
> +		return 0;
> +
> +	ALLOC_GROW(list->list, list->nr + 1, list->alloc);

This probably will become inefficient in large repositories.  You
know you'll be walking all the pack files, and the total number of
objects in a packfile can be read cheaply, so it may make sense to
make a rough guesstimate of the number of commits (e.g. 15-25% of
the total number of objects) in the repository and allocate the
list upfront, instead of a hardcoded 1024.