I am currently looking what can be done to speed up deltaing. The following code can be found here: for (i = 0; i < hsize; i++) { if (hash_count[i] < HASH_LIMIT) continue; entry = hash[i]; do { struct index_entry *keep = entry; int skip = hash_count[i] / HASH_LIMIT / 2; do { entry = entry->next; } while(--skip && entry); keep->next = entry; } while(entry); } If I analyze what happens for various values of hash_count[i], I get the following (the first case is by far the worst): HASH_LIMIT <= hash_count[i] < 2*HASH_LIMIT: skip = 0; do .. while terminates with negative skip and entry == 0, keep->next is set to 0 -> all hashes except the first one get dropped. Result is new_hash_count = 1. Ugh, ugh, ugh. 2*HASH_LIMIT <= hash_count[i] < 4*HASH_LIMIT skip = 1; do .. while does one iteration, every second value is skipped, result is that HASH_LIMIT <= new_hash_count < 2*HASH_LIMIT 4*HASH_LIMIT <= hash_limit[i] < 6*HASH_LIMIT skip = 2; do .. while does two iterations, two of three values are skipped, result is that 4*HASH_LIMIT/3 <= new_hash_count < 2*HASH_LIMIT And so on. It would appear that if HASH_LIMIT is supposed to do what it is seemingly intended for, the skip calculation has to be just int skip = hash_count[i] / HASH_LIMIT; Otherwise, there is completely broken behavior for values between HASH_LIMIT and 2*HASH_LIMIT (where only a single hash survives), and for larger values, the limit will be 2*HASH_LIMIT rather than HASH_LIMIT as was probably intended. -- David Kastrup, Kriemhildstr. 15, 44793 Bochum - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html