Re: [PATCH 6/8] revert: Introduce head, todo, done files to persist state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

Jonathan Nieder writes:
> Ramkumar Ramachandra wrote:
> 
> > A cherry-pick/ revert operation consists of several smaller steps.
> > Later in the series, we would like to be able to resume a failed
> > operation.
> 
> When introducing jargon, it is hard to make the intent perfectly
> clear.  I suppose what this means is:
> 
>  Ever since v1.7.2-rc1~4^2~7 (revert: allow cherry-picking more than
>  one commit, 2010-06-02), a single invocation of "git cherry-pick"
>  or "git revert" can perform picks of several individual commits.  To
>  allow "git cherry-pick --abort" to cancel and "git cherry-pick
>  --continue" to resume the entire command, we will need to store some
>  information about the state and the plan at the beginning.

Thanks for digging that up for me.  I'll try to reword it in the next
iteration without introducing more jargon, if that's preferred.

> > Introduce a "head" file to make note of the HEAD when
> > the operation stated (so that the operation can be aborted), a "todo"
> > file to keep the list of the steps to be performed, and a "done" file
> > to keep a list of steps that have completed successfully.  The format
> > of these files is similar to the one used by the "rebase -i" process.
> 
> s/stated/started/ :)  Makes some sense, aside from that.
> 
> It would be more conventional to use all-caps symref-like names, like
> MULTIPLE_CHERRY_PICK_ORIG_HEAD, CHERRY_PICK_TODO, and
> CHERRY_PICK_DONE, or to put these files in a subdirectory (oh, they're
> already in a subdirectory?  Why didn't you mention that? :)).

I'll mention that in the commit message next time.

> By the way, what is .git/sequencer/done used for?

I don't know :p
I just added it because "rebase -i" uses it too, although I can't find
a definite usecase for it.

> > --- a/builtin/revert.c
> > +++ b/builtin/revert.c
> > @@ -25,6 +26,13 @@
> >   * Copyright (c) 2005 Junio C Hamano
> >   */
> >  
> > +#define SEQ_DIR "sequencer"
> > +
> > +#define SEQ_PATH	git_path(SEQ_DIR)
> > +#define HEAD_FILE	git_path(SEQ_DIR "/head")
> > +#define TODO_FILE	git_path(SEQ_DIR "/todo")
> > +#define DONE_FILE	git_path(SEQ_DIR "/done")
> 
> These seeming constants that call a function are kind of scary.

Uh, you'd prefer seeing it literally spelt out over and over again?

> > @@ -629,21 +637,118 @@ static int read_and_refresh_cache(struct replay_opts *opts)
> >  	return 0;
> >  }
> >  
> > +static int format_todo(struct strbuf *buf, struct commit_list *list,
> > +			struct replay_opts *opts)
> > +{
> > +	struct commit_list *cur = NULL;
> > +	struct commit_message msg = { NULL, NULL, NULL, NULL, NULL };
> > +	const char *sha1 = NULL;
> > +	const char *action;
> > +
> > +	action = (opts->action == REVERT ? "revert" : "pick");
> > +	for (cur = list; cur; cur = cur->next) {
> > +		sha1 = find_unique_abbrev(cur->item->object.sha1, DEFAULT_ABBREV);
> > +		if (get_message(cur->item, cur->item->buffer, &msg))
> > +			return error(_("Cannot get commit message for %s"), sha1);
> > +		strbuf_addf(buf, "%s %s %s\n", action, sha1, msg.subject);
> 
> Is this internal state or for the user?  If it is internal state, I'd
> naïvely have expected a sequence of 40-character hexadecimal lines,
> perhaps with human-readable names like "topic~3" for the sake of
> error messages if git knows about them.

For the user.  This is the instruction sheet that the user will be
able to edit at a later stage, when we develop something like
"sequencer -i".

> > +static int persist_initialize(unsigned char *head)
> > +{
> > +	struct strbuf buf = STRBUF_INIT;
> > +	int fd;
> > +
> > +	if (!file_exists(SEQ_PATH) && mkdir(SEQ_PATH, 0777)) {
> 
> What if .git/sequencer exists and is a file?  How does this interact
> with "[core] sharedrepository" configuration?  What happens if
> .git/sequencer contains some stale files --- if the power fails while
> git is writing new files in .git/sequencer/, will the state be
> confusing?

Ah, thanks for pointing these out -- I hadn't thought about them earlier.

> > +		int err = errno;
> > +		strbuf_release(&buf);
> > +		error(_("Could not create sequencer directory '%s': %s"),
> > +			SEQ_PATH, strerror(err));
> > +		return -err;
> 
> Why does the caller care about which errno, and what is it going to
> do with that information?

Hm, error_errno seems that consistently returns -1 seems like a good
idea now.  I'll get it back into the series next time.

> > +	}
> > +
> > +	if ((fd = open(HEAD_FILE, O_WRONLY | O_CREAT | O_TRUNC, 0666)) < 0) {
> 
> More idiomatic in the git codebase to write:
> 
> 	fd = open(...);
> 	if (fd < 0) {

Ok.

> > +		int err = errno;
> > +		strbuf_release(&buf);
> > +		error(_("Could not open '%s' for writing: %s"),
> > +			HEAD_FILE, strerror(err));
> > +		return -err;
> 
> As above.  Why does the caller care about errno?  If backing out after
> an error, I suppose it might make sense to rmdir .git/sequencer while
> at it.

Ok.

> > +	}
> > +
> > +	strbuf_addf(&buf, "%s", find_unique_abbrev(head, DEFAULT_ABBREV));
> 
> Why abbreviate?
> 
> > +	write_or_whine(fd, buf.buf, buf.len, HEAD_FILE);
> 
> What happens and should happen on error?



> [...]
> > +static int persist_todo_done(int res, struct commit_list *todo_list,
> > +			struct commit_list *done_list, struct replay_opts *opts)
> 
> This is about recording what has been done and what remains to
> be done?  What does the res argument represent?

Exit status? I have to think back to figure out why I chose to pass it
around like this.

> > +{
> > +	struct strbuf buf = STRBUF_INIT;
> > +	int fd, res2;
> > +
> > +	if (!res)
> > +		return 0;
> > +
> > +	/* TODO file */
> > +	if ((fd = open(TODO_FILE, O_WRONLY | O_CREAT | O_TRUNC, 0666)) < 0) {
> 
> What happens if we are interrupted in the middle of writing this?

I'll use the lockfile API if I have time before the next iteration.

> > +		int err = errno;
> > +		strbuf_release(&buf);
> > +		error(_("Could not open '%s' for writing: %s"),
> > +			TODO_FILE, strerror(err));
> > +		return -err;
> 
> I don't think the caller should care which errno. :)

Right :)

> [...]
> >  static int pick_commits(struct replay_opts *opts)
> >  {
> > +	struct commit_list *done_list = NULL;
> >  	struct rev_info revs;
> >  	struct commit *commit;
> > +	unsigned char head[20];
> >  	int res;
> >  
> > +	if (get_sha1("HEAD", head))
> > +		return error(_("You do not have a valid HEAD"));
> 
> What should happen if I try to cherry-pick onto an unborn branch?  I
> haven't checked what happens.

fatal: You do not have a valid HEAD
I just tried it :)

> > +
> >  	if ((res = read_and_refresh_cache(opts)) ||
> > -		(res = prepare_revs(&revs, opts)))
> > +		(res = prepare_revs(&revs, opts)) ||
> > +		(res = persist_initialize(head)))
> >  		return res;
> >  
> > -	while ((commit = get_revision(&revs)) &&
> > -		!(res = do_pick_commit(commit, opts)))
> > -		;
> > -
> > -	return res;
> > +	while ((commit = get_revision(&revs))) {
> > +		if (!(res = do_pick_commit(commit, opts)))
> > +			commit_list_insert(commit, &done_list);
> 
> This puts done_list in the reverse order that the commits were
> cherry-picked.  Is that the intent?

Yes, although I'm not sure what to do with the done file now (you
pointed that out earlier).

> > +		else {
> > +			commit_list_insert(commit, &revs.commits);
> > +			break;
> > +		}
> > +	}
> > +	return persist_todo_done(res, revs.commits, done_list, opts);
> 
> A few potential trade-offs:
> 
>  - should cherry-pick record the state after every commit?  This would
>    be safe against stray die() calls or segfaults but requires hitting
>    the filesystem which might not be wanted if doing a run of
>    cherry-picks in memory (though git is far from supporting such a
>    "many cherry picks in core followed by checkout and packed
>    collection of objects written to disk all at once" optimization
>    anyway).

Agreed, but I don't think this kind of optimization will be in the
scope of this series.  It's nice to think about though.

>  - should we use O_TRUNC or O_APPEND to modify the state in-place or
>    use separate files and rename them into place?  The latter is
>    safer against sudden exit.

The latter, definitely.  I don't want to have to deal with malformed
instruction sheets.

>  - should we (perhaps optionally) fsync the state when commiting to it?
>    I think no, but someone performing a rebase and running a test suite
>    with the potential to crash the system between commits might appreciate
>    the effort.

Yes, yes! I'd love this feature.

Thanks.

-- Ram
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]