Re: [PATCH v2] unpack-trees: avoid duplicate ODB lookups during checkout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 





On 4/10/2017 4:55 PM, Jeff King wrote:
On Sat, Apr 08, 2017 at 04:06:41PM +0200, René Scharfe wrote:

+		} else if (i > 1 && are_same_oid(&names[i], &names[i-2])) {
+			/* implicitly borrow buf[i-2] inside tree_desc[i] */
+			memcpy(&t[i], &t[i-2], sizeof(struct tree_desc));

Similar case.

+			buf[i] = NULL;

What makes the previous two entries special, or differently put: Why not
just check *all* previous entries?  MAX_UNPACK_TREES is 8; the number of
comparisons would just about double in the worst case and stay the same for
three trees or less.  The order of four trees or more wouldn't matter
anymore.

If I understand correctly, we've got N (up to 8) trees, and we want to
find sets of duplicates. Adjacency doesn't matter. Aren't our options
basically:

  - look through all previous entries naively, O(N^2)

  - sort the list, O(N log N); but we need the original order, so we'd
    have to unsort at the end, too

  - use a hash table, O(N) but with O(N) extra storage

I know N=8 isn't very big algorithmically speaking, but it would feel
ugly to introduce something quadratic. Especially the limit of 8 seems
rather arbitrary. In normal cases we see at most a 3-way merge. Beyond
that we're in octopus territory, and 8 is way too low there (I think the
octopus strategy just unpacks one at a time and barfs if there are any
conflicts).

I assume the rationale behind "check the last 2" is just "this
optimization will kick in reliably for all sane cases", weird octopus
unpack-trees be damned (though reading ca885a4fe, it sounds like 4-trees
can actually happen, but I'm not clear on how).

yeah, i didn't want to pay for the obscure (n >= 4) cases with any
of the above.  doing 2 or 3 gets us checkout and merge.  these are
the common usages.

Jeff




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]