Re: [PATCH 2/3] prune: fix pruning with multiple worktrees and split index

On 12/10, Thomas Gummerer wrote:
> be489d02d2 ("revision.c: --indexed-objects add objects from all
> worktrees", 2017-08-23) made sure that pruning takes objects from all
> worktrees into account.
> 
> It did that by reading the index of every worktree and adding the
> necessary index objects to the set of pending objects.  The index is
> read by read_index_from.  As mentioned in the previous commit,
> read_index_from depends on the CWD for the location of the split index,
> and add_index_objects_to_pending doesn't set that before using
> read_index_from.
> 
> Instead of using read_index_from, use repo_read_index, which is aware of
> the proper paths for the worktree.
> 
> This fixes t5304-prune when ran with GIT_TEST_SPLIT_INDEX set.
> 

I'm on the fence about this change.  I understand that this will ensure
that the proper objects aren't pruned when using a split index in the
presence of worktrees but I think the solution needs to be thought
through a bit more.

My big concern right now is the interaction of 'struct worktree's and
'struct repository'.  I'll try to highlight my concerns below.

> Signed-off-by: Thomas Gummerer <t.gummerer@xxxxxxxxx>
> ---
> 
> This also fixes t7009 when ran with GIT_TEST_SPLIT_INDEX.  I'm not
> quite sure why it is fixed by this.  Either way I tracked the failure
> down to f767178a5a ("Merge branch 'jk/no-null-sha1-in-cache-tree'",
> 2017-05-16).  Maybe Peff has an idea why this fixes that test?
> 
>  repository.c | 11 +++++++++++
>  repository.h |  2 ++
>  revision.c   | 13 ++++++++-----
>  3 files changed, 21 insertions(+), 5 deletions(-)
> 
> diff --git a/repository.c b/repository.c
> index 928b1f553d..3c9bfbd1b8 100644
> --- a/repository.c
> +++ b/repository.c
> @@ -2,6 +2,7 @@
>  #include "repository.h"
>  #include "config.h"
>  #include "submodule-config.h"
> +#include "worktree.h"
>  
>  /* The main repository */
>  static struct repository the_repo = {
> @@ -146,6 +147,16 @@ int repo_init(struct repository *repo, const char *gitdir, const char *worktree)
>  	return -1;
>  }
>  
> +/*
> + * Initialize 'repo' based on the provided worktree
> + * Return 0 upon success and a non-zero value upon failure.
> + */
> +int repo_worktree_init(struct repository *repo, struct worktree *worktree)
> +{
> +	return repo_init(repo, get_worktree_git_dir(worktree),
> +			 worktree->path);
> +}

My first concern is the use of 'get_worktree_git_dir()'.  Under the hood
it calls 'get_git_dir()', 'get_git_common_dir()', and
'git_common_path()' which rely on global state as stored in
'the_repository'.  So how does one initialize a repository struct (using
this initializer) using a worktree from a repository other than the
global 'the_repository' struct?  I'm not sure I have an answer right
now, but it's an issue that needs to be thought through before we head
down this road.
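
To make the concern concrete, here is a rough standalone sketch.  It is
not git code: every 'sketch_*' name is made up for illustration, and the
only thing taken from git is the on-disk layout where a linked worktree's
git dir lives under "<commondir>/worktrees/<id>".  The global variant can
only ever answer for the one global repository, while the explicit
variant takes the owning repository as a parameter.

```c
#include <stdio.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT git's real structs. */
struct sketch_repo {
	const char *commondir;	/* e.g. "/main/.git" */
};

struct sketch_worktree {
	const char *id;		/* NULL for the main worktree */
};

/* Stand-in for the global 'the_repository'. */
static struct sketch_repo sketch_the_repo = { "/main/.git" };

/*
 * Explicit variant: the caller says which repository the worktree
 * belongs to.  Linked worktrees live in <commondir>/worktrees/<id>.
 * Returns 0 on success, -1 if 'out' is too small.
 */
int sketch_worktree_git_dir(const struct sketch_repo *repo,
			    const struct sketch_worktree *wt,
			    char *out, size_t len)
{
	int n;

	if (!wt->id)
		n = snprintf(out, len, "%s", repo->commondir);
	else
		n = snprintf(out, len, "%s/worktrees/%s",
			     repo->commondir, wt->id);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Global variant: can only ever answer for sketch_the_repo. */
int sketch_worktree_git_dir_global(const struct sketch_worktree *wt,
				   char *out, size_t len)
{
	return sketch_worktree_git_dir(&sketch_the_repo, wt, out, len);
}
```

With a second repository whose common dir is "/other/.git", the global
variant still resolves a worktree against "/main/.git", which is the
mismatch I'm worried about.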

Just thinking out loud: does it make sense to have worktrees as a
separate struct, or should they be stored in 'struct repository' in some
way?  Shouldn't a repository struct have a way to interact with all of
its worktrees?  How would initializing a repository struct for every
worktree work once we migrate the object store to be stored in 'struct
repository'?  Shouldn't every worktree share the same object store
in-memory like they do on-disk?

> +
>  /*
>   * Initialize 'submodule' as the submodule given by 'path' in parent repository
>   * 'superproject'.
> diff --git a/repository.h b/repository.h
> index 7f5e24a0a2..2adeb05bf4 100644
> --- a/repository.h
> +++ b/repository.h
> @@ -4,6 +4,7 @@
>  struct config_set;
>  struct index_state;
>  struct submodule_cache;
> +struct worktree;
>  
>  struct repository {
>  	/* Environment */
> @@ -87,6 +88,7 @@ extern struct repository *the_repository;
>  extern void repo_set_gitdir(struct repository *repo, const char *path);
>  extern void repo_set_worktree(struct repository *repo, const char *path);
>  extern int repo_init(struct repository *repo, const char *gitdir, const char *worktree);
> +extern int repo_worktree_init(struct repository *repo, struct worktree *worktree);
>  extern int repo_submodule_init(struct repository *submodule,
>  			       struct repository *superproject,
>  			       const char *path);
> diff --git a/revision.c b/revision.c
> index e2e691dd5a..9d8d9b96d1 100644
> --- a/revision.c
> +++ b/revision.c
> @@ -22,6 +22,7 @@
>  #include "packfile.h"
>  #include "worktree.h"
>  #include "argv-array.h"
> +#include "repository.h"
>  
>  volatile show_early_output_fn_t show_early_output;
>  
> @@ -1346,15 +1347,17 @@ void add_index_objects_to_pending(struct rev_info *revs, unsigned int flags)
>  	worktrees = get_worktrees(0);
>  	for (p = worktrees; *p; p++) {
>  		struct worktree *wt = *p;
> -		struct index_state istate = { NULL };
> +		struct repository *repo;
>  
> +		repo = xmalloc(sizeof(struct repository));

This is allocated but never freed (and it is allocated even when the
loop hits the 'continue' just below).  Was that intentional?
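
For illustration only, here is a standalone sketch of the loop shape I
would have expected: skip the current worktree before allocating, and
release the object at the end of every iteration.  The 'sketch_*' names
and the allocation counter are hypothetical stand-ins, not git's API; in
the real code the release step would presumably also need whatever
teardown 'struct repository' requires.

```c
#include <stdlib.h>

/* Hypothetical stand-in for 'struct repository'; not git's real type. */
struct sketch_repo {
	int initialized;
};

/* Counts allocations that have not been freed yet. */
static int sketch_live_allocs;

static struct sketch_repo *sketch_repo_new(void)
{
	struct sketch_repo *repo = calloc(1, sizeof(*repo));

	if (!repo)
		abort();
	sketch_live_allocs++;
	return repo;
}

static void sketch_repo_free(struct sketch_repo *repo)
{
	free(repo);
	sketch_live_allocs--;
}

/*
 * Walks 'n' worktrees; is_current[i] is nonzero for the current one.
 * Returns the number of worktrees whose (pretend) index was read.
 */
int sketch_walk_worktrees(const int *is_current, int n)
{
	int i, visited = 0;

	for (i = 0; i < n; i++) {
		struct sketch_repo *repo;

		if (is_current[i])
			continue; /* skip before allocating anything */

		repo = sketch_repo_new();
		repo->initialized = 1;	/* stands in for init + index read */
		visited++;
		sketch_repo_free(repo);	/* released on every iteration */
	}
	return visited;
}

/* Number of sketch_repo objects still allocated. */
int sketch_outstanding_allocs(void)
{
	return sketch_live_allocs;
}
```

A stack-allocated struct inside the loop would avoid the question
entirely, if nothing needs the object to outlive the iteration.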

>  		if (wt->is_current)
>  			continue; /* current index already taken care of */
> +		if (repo_worktree_init(repo, wt))
> +			BUG("couldn't initialize repository object from worktree");
>  
> -		if (read_index_from(&istate,
> -				    worktree_git_path(wt, "index")) > 0)
> -			do_add_index_objects_to_pending(revs, &istate);
> -		discard_index(&istate);
> +		if (repo_read_index(repo) > 0)
> +			do_add_index_objects_to_pending(revs, repo->index);
> +		discard_index(repo->index);

Once we have separate object stores per repository, how would we handle
this, since this pruning should only work on a single repository's object
store?

>  	}
>  	free_worktrees(worktrees);
>  }
> -- 
> 2.15.1.504.g5279b80103
> 

-- 
Brandon Williams
