Re: [PATCH 1/3] cache-tree: Create/update cache-tree on checkout

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 2014-07-01 at 13:15 -0700, Junio C Hamano wrote:
> David Turner <dturner@xxxxxxxxxxxxxxxx> writes:
> 
> > When git checkout checks out a branch, create or update the
> > cache-tree so that subsequent operations are faster.
> >
> > Signed-off-by: David Turner <dturner@xxxxxxxxxxx>
> > ---
> >  builtin/checkout.c    |  8 ++++++++
> >  cache-tree.c          |  5 +++--
> >  t/t0090-cache-tree.sh | 15 ++++++++++++++-
> >  3 files changed, 25 insertions(+), 3 deletions(-)
> >
> > diff --git a/builtin/checkout.c b/builtin/checkout.c
> > index 07cf555..a023a86 100644
> > --- a/builtin/checkout.c
> > +++ b/builtin/checkout.c
> > @@ -553,6 +553,14 @@ static int merge_working_tree(const struct checkout_opts *opts,
> >  		}
> >  	}
> >  
> > +	if (!active_cache_tree)
> > +		active_cache_tree = cache_tree();
> > +
> > +	if (!cache_tree_fully_valid(active_cache_tree))
> > +		cache_tree_update(active_cache_tree,
> > +				  (const struct cache_entry * const *)active_cache,
> > +				  active_nr, 0);
> > +
> 
> This looks much better than the previous round, but it still does
> allow verify_cache() to throw noises against unmerged entries in the
> cache, as WRITE_TREE_SILENT flag is not passed down, no?
> 
> 	$ git checkout master^0
>         $ git am $this_message
>         $ make
>         $ edit builtin/checkout.c ;# make changes to the above lines
>         $ ./git checkout -m master^0
> x       builtin/checkout.c: unmerged (972c8a7b28f16f88475268f9a714048c228e69db)
> x       builtin/checkout.c: unmerged (f1dc56e55f7b2200412142b10517458ccfda2952)
> x       builtin/checkout.c: unmerged (3b9753ba8c19e7dfe6e922f30eb85c83a92a4596)
>         M       builtin/checkout.c
>         Warning: you are leaving 1 commit behind, not connected to
>         any of your branches:
> 
>           25fab54 cache-tree: Create/update cache-tree on checkout
> 
>         Switched to branch 'master'
> 
> Passing WRITE_TREE_SILENT in the flags parameter will get rid of the
> conflict notice output from the above.
> 
> The user is not interested in writing a brand new tree object at all
> in this case, so it feels wrong to actually let the call chain go
> down to update_one() and create new tree objects.
> 
> 	Side note.  And passing WRITE_TREE_DRY_RUN is not a good
> 	solution either, because a later write_cache_as_tree() will
> 	not create the necessary tree object once you stuff a tree
> 	object name in the cache-tree.
> 
> What we want in this code path is a way to repair a sub cache_tree
> if it can be repaired without creating a new tree object and
> otherwise leave that part invalid.  The existing cache-tree
> framework is not prepared to do that kind of thing.  It wants to
> start from the bottom and percolate things up, computing levels
> nearer to the top-level only after it fully created the trees for
> deeper levels, because it is meant to be used only when we really
> want to write out trees.  We may want to reuse update_one() but
> 
> I am not convinced that doing an equivalent of write-tree when you
> switch branches is the right approach in the first place.  You will
> eventually write it out as a tree, and having a relatively undamaged
> cache-tree will help you when you do so, but spending the cycles
> necessary to compute a fully populated cache-tree, only to let it
> degrade over time until you are ready to write it out as a tree,
> somehow sounds like asking for a duplicated work upfront.

As I understand it, the cache-tree extension was originally designed to
speed up writing the tree later.  However, as Karsten Blees's work (and
my own tests) have shown, it also speeds up git status.  I use git
status a lot while working, and I've talked to a few others who do the
same.  So I think it's worth spending extra time when switching branches
to have a good working experience within that branch.

In the new version of the patchset (which I'll post shortly), I've added
an option WRITE_TREE_REPAIR, which does all of the work to compute a new
tree object, but only adds it to the cache-tree if it already exists
on-disk.  This is a little wasteful for the reason that you note.  So if
you would like, I could add a config option to skip it.  But I think it
is a good default.

Does this seem OK to you, assuming the implementation is good? 

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]