Re: [PATCH 11/11] Resumable clone: implement primer logic in git-clone

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Sep 16, 2016 at 04:32:05PM -0700, Junio C Hamano wrote:
> >  -----------
> > @@ -172,6 +173,12 @@ objects from the source repository into a pack in the cloned repository.
> >  	via ssh, this specifies a non-default path for the command
> >  	run on the other end.
> >  
> > +--prime-clone <prime-clone>::
> > +-p <prime-clone>::
> 
> Not many other options have single letter shorthand.  Is it expected
> that it is worth to let this option squat on a short-and-sweet "-p",
> perhaps because it is so frequently used?

I based my decision more on precedent than value--the "--upload-pack" option
had the shorthand "-u".

> > @@ -40,17 +42,20 @@ static const char * const builtin_clone_usage[] = {
> >  
> >  static int option_no_checkout, option_bare, option_mirror, option_single_branch = -1;
> >  static int option_local = -1, option_no_hardlinks, option_shared, option_recursive;
> > +static int option_resume;
> >  static char *option_template, *option_depth;
> > -static char *option_origin = NULL;
> > +static const char *option_origin = NULL;
> 
> Is this change related to anything you are doing here?
> 
> If you are fixing things while at it, please don't ;-) If you really
> want to, please also remove " = NULL", from this line and also from
> the next line.  Also do not add " = NULL" at the end of alt_res.
> 

Yes, it was because remote_config uses all const char* strings, and .name gets
assigned to option_origin when it pulls the previous info from config. I looked
to see if option_origin's underlying string is ever modified in a way that
makes it ineligible to be const, and it wasn't.

So, I should remove the NULLs here? And then initalize alt_res to NULL in
cmd_clone?

> >  static char *option_branch = NULL;
> >  ...
> > +static const struct alt_resource *alt_res = NULL;
> 
> > +static char *get_filename(const char *dir)
> > +{
> > +	char *dir_copy = xstrdup(dir);
> > +	strip_trailing_slashes(dir_copy);
> > +	char *filename, *final = NULL;
> > +
> > +	filename = find_last_dir_sep(dir);
> > +
> > +	if (filename && *(++filename))
> > +		final = xstrdup(filename);
> > +
> > +	free(dir_copy);
> > +	return final;
> > +}
> 
> Hmph, don't we have our own basename(3) lookalike that knows about
> dir-sep already?

Whoops, did not catch that. Thanks!

> > @@ -562,7 +614,7 @@ static void write_remote_refs(const struct ref *local_refs)
> >  		die("%s", err.buf);
> >  
> >  	for (r = local_refs; r; r = r->next) {
> > -		if (!r->peer_ref)
> > +		if (!r->peer_ref || ref_exists(r->peer_ref->name))
> >  			continue;
> >  		if (ref_transaction_create(t, r->peer_ref->name, r->old_oid.hash,
> >  					   0, NULL, &err))
> 
> What is this change about?

Because resumable clone is supposed to handle halting, I included the
possibility of halting after the final fetch happened and remote refs were
updated (basically anything after resumable download happens). This change
basically says "If the remote ref was not previously written, then write it."
Because the write is all-or-nothing, this means the logic should either write
all of the intended references, or none of them (because they were written
in the previous invocation).

> > +static const char *setup_and_index_pack(const char *filename)
> > +{
> > +	const char *primer_idx_path = NULL, *primer_bndl_path = NULL;
> > +	primer_idx_path = replace_extension(filename, ".pack", ".idx");
> > +	primer_bndl_path = replace_extension(filename, ".pack", ".bndl");
> > +
> > +	if (!(primer_idx_path && primer_bndl_path)) {
> > +		warning("invalid pack filename '%s', falling back to full "
> > +			"clone", filename);
> > +		return NULL;
> > +	}
> > +
> > +	if (!file_exists(primer_bndl_path)) {
> > +		if (do_index_pack(filename, primer_idx_path)) {
> > +			warning("could not index primer pack, falling back to "
> > +				"full clone");
> > +			return NULL;
> > +		}
> > +	}
> 
> Can it be another (undetected) failure mode that .bndl somehow
> already existed, but not .idx, leaving the resulting object store in
> an incosistent state?  Can do_index_pack() fail and leave .bndl
> behind to get you into such a state?

I don't think so, based on looking at builtin/index-pack.c. write_bundle_file()
happens at the very end of final(), which is long after the process created the
resulting .idx, or has died in the event one is not created. There also is no
automatic cleanup of .idx after that if the process dies somehow after .bndl
is written.

> > +static int write_bundle_refs(const char *bundle_filename)
> > +{
> > +	struct ref_transaction *t;
> > +	struct bundle_header history_tips;
> > +	const char *temp_ref_base = "resume";
> > +	struct strbuf err = STRBUF_INIT;
> > +	int i;
> > +
> > +	init_bundle_header(&history_tips, bundle_filename);
> > +	read_bundle_header(&history_tips);
> > +
> > +	t = ref_transaction_begin(&err);
> > +	for (i = 0; i < history_tips.references.nr; i++) {
> > +		struct strbuf ref_name = STRBUF_INIT;
> > +		strbuf_addf(&ref_name, "refs/temp/%s/%s/temp-%s",
> > +			    option_origin, temp_ref_base,
> > +			    sha1_to_hex(history_tips.references.list[i].sha1));
> 
> Can we do this without polluting refs/temp/ namespace?
> 
> I am imagining that you are first fetching the .pack file from
> sideways when primer service is available, running index-pack on it
> to produce the bundle, and the step after that is to run "git fetch"
> against the original remote to fill the gap between the bit-stale
> history you got in the bundle and the reality that has progressed
> since the primer pack was made, and you need a way to tell to the
> other end that you already have the history leading to these refs
> when you run "git fetch".  I think a bit better way to do so is to
> send these has ".have" while you run the "fetch".
> 
> Wouldn't it do if you add the "--advertise-bundle-tips=<bndl>"
> option to "git fetch", move the code to read the bundle header to
> it, and point the bundle's filename with the option when you spawn
> "git fetch"?

My implementation is definitely a minimum working model. I was hoping to
move to something cleaner.

Is this new option preferable to leveraging the "--reference" option you
mentioned in the earlier discussion? I thought that was a clean solution,
especially because it uses an existing option. Additionally, would there be
any use for this new fetch option outside of cloning? If so, I could see the
value--otherwise, knowing that we want to keep as much resume-specific
knowledge inside clone as possible, "--reference" with a .bndl seems better.

- Kevin



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]