Re: [PATCH v3 2/4] unpack-trees: optimize walking same trees with cache-tree

Elijah Newren <newren@xxxxxxxxx> · Fri, 10 Aug 2018 11:48:33 -0700

On Fri, Aug 10, 2018 at 9:29 AM Duy Nguyen <pclouds@xxxxxxxxx> wrote:
> On Wed, Aug 8, 2018 at 8:23 PM Elijah Newren <newren@xxxxxxxxx> wrote:

> > > +        * cache-tree would be invalidated and we would never get here
> > > +        * in the first place.
> > > +        */
> > > +       for (i = 0; i < nr_entries; i++) {
> > > +               struct cache_entry *tree_ce;
> > > +               int len, rc;
> > > +
> > > +               src[0] = o->src_index->cache[pos + i];
> > > +
> > > +               len = ce_namelen(src[0]);
> > > +               tree_ce = xcalloc(1, cache_entry_size(len));
> > > +
> > > +               tree_ce->ce_mode = src[0]->ce_mode;
> > > +               tree_ce->ce_flags = create_ce_flags(0);
> > > +               tree_ce->ce_namelen = len;
> > > +               oidcpy(&tree_ce->oid, &src[0]->oid);
> > > +               memcpy(tree_ce->name, src[0]->name, len + 1);
> >
> > We do a bunch of work to setup tree_ce...
> >
> > > +               for (d = 1; d <= nr_names; d++)
> > > +                       src[d] = tree_ce;
> >
> > ...then we make nr_names copies of tree_ce (so that *way_merge or
> > bind_merge or oneway_diff or whatever will have the expected number of
> > entries).
> >
> > > +               rc = call_unpack_fn((const struct cache_entry * const *)src, o);
> >
> > ...then we call o->fn (via call_unpack_fn) to do various complicated
> > logic to figure out which tree_ce to use??  Isn't that just an
> > expensive way to recompute that what we currently have in the index is
> > what we want to keep there?
> >
> > Granted, a caller of this may have set o->fn to something other than
> > {one,two,three}way_merge (or bind_merge), and that function might have
> > important side effects...but it just seems annoying to have to do so
> > much work when for most uses we already know the entry in the index is
> > the one we already want.
>
> I'm not so sure about that. Which is why I keep it generic.
>
> > In fact, the only other thing in the
> > codebase that o->fn is now set to is oneway_diff, which I think is a
> > no-op when the two trees match.
> >
> > Would be nice if we could avoid all this, at least in the common cases
> > where o->fn is a function known to not have side effects.  Or did I
> > not read those functions closely enough and they do have important
> > side effects?
>
> In one of my earlier "how about this" attempts, I introduced fn_same
> [1] that can help achieve this without carving "known not to have side
> effects" in common code. Which I think is still a good direction to go
> if we want to optimize more aggressively. We could have something like
> this
>
> diff --git a/unpack-trees.c b/unpack-trees.c
> index 1f11991a51..01b80389e0 100644
> --- a/unpack-trees.c
> +++ b/unpack-trees.c
> @@ -699,6 +699,9 @@ static int traverse_by_cache_tree(int pos, int
> nr_entries, int nr_names,
>         int ce_len = 0;
>         int i, d;
>
> +       if (o->fn_cache_tree)
> +               return o->fn_cache_tree(pos, nr_entries, nr_names, names, info);
> +
>         if (!o->merge)
>                 BUG("We need cache-tree to do this optimization");
>
> then you can add, say threeway_cache_tree_merge(), that does what
> traverse_by_cache_tree() does but more efficient. This involves a lot
> more work (mostly staring and those n-merge functions and making sure
> you don't set the right conditions before going the fast path).
>
> I didn't do it because.. well.. it's more work and also riskier. I
> think we can leave that for later, unless you think we should do it
> now.
>
> [1] https://public-inbox.org/git/20180726163049.GA15572@xxxxxxxxxxxxxx/

Yeah, from your other thread, I think I was missing some of the
intracacies of how the cache-tree works and the extra work that'd be
needed to bring it along. Deferring until later makes sense.