On Thu, Mar 22, 2018 at 10:46:09PM -0400, Jeff King wrote: > > which begs the question, how much slower would it be if we > > replaced the radix-sort with an in-place sort (e.g. heapsort). > > > > I hacked up the patch below, just for fun. I don't have any > > large repos (or enough disk space) to do any meaningful perf > > tests, but I did at least compile it and it passes the test-suite. > > (That is no guarantee that I haven't introduced bugs, of course!) > > It might have been easier to just revert 8b8dfd5132 (pack-revindex: > radix-sort the revindex, 2013-07-11). It even includes some performance > numbers. :) > > In short, no, I don't think we want to go back to a comparison-sort. The > radix sort back then was around 4 times faster for linux.git. And that > was when there were half as many objects in the repository, so the radix > sort should continue to improve as the repo size grows. I was curious whether my hand-waving there was true. It turns out that it is: the radix sort has stayed about the same speed but the comparison sort has gotten even slower. Here are best-of-five timings for "git cat-file --batch-check='%(objectsize:disk)'", which does very little besides generate the rev-index: [current master, using radix sort] real 0m0.104s user 0m0.088s sys 0m0.016s [reverting 8b8dfd5132, going back to qsort] real 0m1.193s user 0m1.176s sys 0m0.016s So it's now a factor of 11. Yikes. That number does match some napkin math. The radix sort uses four 16-bit buckets, but can quit when after two rounds (because none of the offsets is beyond 2^32). So it's essentially O(2n). Whereas the comparison sort is O(n log n), and with n around 6M, that puts log(n) right around 22. It's possible that some other comparison-based sort might be a little more efficient than qsort, but I don't think you'll be able to beat the algorithmic speedup. The revert of 8b8dfd5132 is below for reference (it needed a few conflict tweaks). -Peff --- diff --git a/pack-revindex.c b/pack-revindex.c index ff5f62c033..c20aa9541b 100644 --- a/pack-revindex.c +++ b/pack-revindex.c @@ -15,102 +15,11 @@ * get the object sha1 from the main index. */ -/* - * This is a least-significant-digit radix sort. - * - * It sorts each of the "n" items in "entries" by its offset field. The "max" - * parameter must be at least as large as the largest offset in the array, - * and lets us quit the sort early. - */ -static void sort_revindex(struct revindex_entry *entries, unsigned n, off_t max) +static int cmp_offset(const void *a_, const void *b_) { - /* - * We use a "digit" size of 16 bits. That keeps our memory - * usage reasonable, and we can generally (for a 4G or smaller - * packfile) quit after two rounds of radix-sorting. - */ -#define DIGIT_SIZE (16) -#define BUCKETS (1 << DIGIT_SIZE) - /* - * We want to know the bucket that a[i] will go into when we are using - * the digit that is N bits from the (least significant) end. - */ -#define BUCKET_FOR(a, i, bits) (((a)[(i)].offset >> (bits)) & (BUCKETS-1)) - - /* - * We need O(n) temporary storage. Rather than do an extra copy of the - * partial results into "entries", we sort back and forth between the - * real array and temporary storage. In each iteration of the loop, we - * keep track of them with alias pointers, always sorting from "from" - * to "to". - */ - struct revindex_entry *tmp, *from, *to; - int bits; - unsigned *pos; - - ALLOC_ARRAY(pos, BUCKETS); - ALLOC_ARRAY(tmp, n); - from = entries; - to = tmp; - - /* - * If (max >> bits) is zero, then we know that the radix digit we are - * on (and any higher) will be zero for all entries, and our loop will - * be a no-op, as everybody lands in the same zero-th bucket. - */ - for (bits = 0; max >> bits; bits += DIGIT_SIZE) { - unsigned i; - - memset(pos, 0, BUCKETS * sizeof(*pos)); - - /* - * We want pos[i] to store the index of the last element that - * will go in bucket "i" (actually one past the last element). - * To do this, we first count the items that will go in each - * bucket, which gives us a relative offset from the last - * bucket. We can then cumulatively add the index from the - * previous bucket to get the true index. - */ - for (i = 0; i < n; i++) - pos[BUCKET_FOR(from, i, bits)]++; - for (i = 1; i < BUCKETS; i++) - pos[i] += pos[i-1]; - - /* - * Now we can drop the elements into their correct buckets (in - * our temporary array). We iterate the pos counter backwards - * to avoid using an extra index to count up. And since we are - * going backwards there, we must also go backwards through the - * array itself, to keep the sort stable. - * - * Note that we use an unsigned iterator to make sure we can - * handle 2^32-1 objects, even on a 32-bit system. But this - * means we cannot use the more obvious "i >= 0" loop condition - * for counting backwards, and must instead check for - * wrap-around with UINT_MAX. - */ - for (i = n - 1; i != UINT_MAX; i--) - to[--pos[BUCKET_FOR(from, i, bits)]] = from[i]; - - /* - * Now "to" contains the most sorted list, so we swap "from" and - * "to" for the next iteration. - */ - SWAP(from, to); - } - - /* - * If we ended with our data in the original array, great. If not, - * we have to move it back from the temporary storage. - */ - if (from != entries) - COPY_ARRAY(entries, tmp, n); - free(tmp); - free(pos); - -#undef BUCKET_FOR -#undef BUCKETS -#undef DIGIT_SIZE + const struct revindex_entry *a = a_; + const struct revindex_entry *b = b_; + return (a->offset < b->offset) ? -1 : (a->offset > b->offset) ? 1 : 0; } /* @@ -152,7 +61,7 @@ static void create_pack_revindex(struct packed_git *p) */ p->revindex[num_ent].offset = p->pack_size - 20; p->revindex[num_ent].nr = -1; - sort_revindex(p->revindex, num_ent, p->pack_size); + qsort(p->revindex, num_ent, sizeof(*p->revindex), cmp_offset); } void load_pack_revindex(struct packed_git *p)