Re: [PATCH 1/8] revert: Improve error handling by cascading errors upwards

Jonathan Nieder <jrnieder@xxxxxxxxx> · Wed, 11 May 2011 04:59:49 -0500

Hi,

Ramkumar Ramachandra wrote:
[reordered for convenience]

> --- a/advice.c
> +++ b/advice.c
> @@ -47,3 +47,17 @@ void NORETURN die_resolve_conflict(const char *me)
>  	else
>  		die("'%s' is not possible because you have unmerged files.", me);
>  }
> +
> +int error_resolve_conflict(const char *me)
> +{
> +	if (advice_resolve_conflict)
> +		/*
> +		 * Message used both when 'git commit' fails and when
> +		 * other commands doing a merge do.
> +		 */
> +		return error("'%s' is not possible because you have unmerged files.\n"
> +			"Please, fix them up in the work tree, and then use 'git add/rm <file>' as\n"
> +			"appropriate to mark resolution and make a commit, or use 'git commit -a'.", me);
> +	else
> +		return error("'%s' is not possible because you have unmerged files.", me);
> +}

Would it make sense to do

 void NORETURN die_resolve_conflict(const char *me)
 {
	error_resolve_conflict(me);
	exit(128);
 }

or is the s/fatal/error/ in output too much?  I would suspect
saying "error:" instead of "fatal:" should be okay if it means having
less code to deal with, but if not, maybe there is some other way to
achieve the appropriate effect (e.g. --- please donât use this; itâs
just an example --- when a global in usage.c about_to_die is true,
changing "error:" to "fatal:" or something like that).

> As a general
> guideline for libification, "die" should be only be called in two
> cases: by toplevel callers like command-line argument parsing
> routines, or when an irrecoverable situation is encountered.

The above hints at to a downside to this principle: currently the
message printed by "die" is clearly labelled as gitâs cause of death.
Still, I think the principle is right and that itâs worth it, and that
generally speaking a library function should not have to know whether
its errors would be an appropriate time to exit or this is a GUI that
will want to stay alive.

>  Has the commit message justified this change fully?

It didnât mention "GUI", so I guess no. :)

Could you give a reminder about how this fits in the sequencer series?
Is it that you want to give advice to the user about how to recover
when exiting (and if so, could an atexit handler work instead for
that)?

I'm reviewing the rest because I like the aforementioned "GUI" use
case, but if you get tired of the exercise, please remember that it is
not self-evidently necessary to continue along this track for the sake
of the sequencer alone.

> --- a/builtin/revert.c
> +++ b/builtin/revert.c
> @@ -167,9 +167,11 @@ static char *get_encoding(const char *message)
>  {
>  	const char *p = message, *eol;
>  
> -	if (!p)
> -		die (_("Could not read commit message of %s"),
> -				sha1_to_hex(commit->object.sha1));
> +	if (!p) {
> +		error(_("Could not read commit message of %s"),
> +			sha1_to_hex(commit->object.sha1));
> +		return NULL;

The only caller to get_encoding is get_message, which already returns
early when not passed a usable message.  So maybe this should say

	assert(p);

or

	if (!p)
		die("BUG: get_message caller forgot to give a commit message");

> --- a/builtin/revert.c
> +++ b/builtin/revert.c
> @@ -198,7 +200,7 @@ static void add_message_to_msg(struct strbuf *msgbuf, const char *message)
>  	strbuf_addstr(msgbuf, p);
>  }
>  
> -static void write_cherry_pick_head(void)
> +static int write_cherry_pick_head(void)

write_cherry_pick_head gets called by do_pick_commit to help a person
pick up where "git cherry-pick" left off if it has to exit when
resolving conflicts.  So itâs a good example of how to avoid having
to clean up after code when it fails. :)

Anyway, the question here is, what should happen if
write_cherry_pick_head fails (for example due to EPERM or ENOSPC)?  If
this is a GUI or another program that has to stay alive, I suppose one
would want to report the error and back out the cherry-pick of the
current commit as far as it has proceeded (meaning freeing some
memory).  From the signature above I would expect that it does

	return error(...);

on failure (which would bring up an explanation of the failure under
the tab bar in my GUI).

>  {
>  	int fd;
>  	struct strbuf buf = STRBUF_INIT;
> @@ -206,12 +208,22 @@ static void write_cherry_pick_head(void)
>  	strbuf_addf(&buf, "%s\n", sha1_to_hex(commit->object.sha1));
>  
>  	fd = open(git_path("CHERRY_PICK_HEAD"), O_WRONLY | O_CREAT, 0666);
> -	if (fd < 0)
> -		die_errno(_("Could not open '%s' for writing"),
> -			  git_path("CHERRY_PICK_HEAD"));
> +	if (fd < 0) {
> +		int err = errno;
> +		strbuf_release(&buf);
> +		error(_("Could not open '%s' for writing: %s"),
> +			git_path("CHERRY_PICK_HEAD"), strerror(err));
> +		return -err;
> +	}

Why does the caller care about errno (i.e., why "return -errno"
instead of "return error(...)")?

> -	if (write_in_full(fd, buf.buf, buf.len) != buf.len || close(fd))
> -		die_errno(_("Could not write to '%s'"), git_path("CHERRY_PICK_HEAD"));
> +	if (write_in_full(fd, buf.buf, buf.len) != buf.len || close(fd)) {
> +		int err = errno;
> +		strbuf_release(&buf);
> +		error(_("Could not write to '%s': %s"),
> +			git_path("CHERRY_PICK_HEAD"), strerror(err));
> +		return -err;
> +	}

In the write_in_full case, missing "close(fd)".

In the close(fd) case, not missing "close(fd)", unless we want to
check for errno == EINTR and handle that.  (If we do, though, it would
probably be in a separate patch, by introducing an xclose function
analagous to xwrite.)  So I suppose this should look something like

	if (write_in_full(...)) {
		ret = error(_("Could not write...
		goto done3;
	}
	if (close(fd)) {
		ret = error(_(...
		goto done2;
	}
	goto done;

 done3:
	close(fd);
 done2:
	unlink_or_warn(git_path("CHERRY_PICK_HEAD"));
 done:
	strbuf_release(&buf);
	return ret;
 }

> @@ -243,17 +255,22 @@ static void print_advice(void)
>  	advise("and commit the result with 'git commit'");
>  }
>  
> -static void write_message(struct strbuf *msgbuf, const char *filename)
> +static int write_message(struct strbuf *msgbuf, const char *filename)

write_message writes the MERGE_MSG file.  Like get_cherry_pick_head,
its purpose is to allow recovery if the cherry-pick runs into
conflicts, and presumably the appropriate way to recover would be to
back out the cherry-pick (meaning to free buffers used so far and
remove the CHERRY_PICK_HEAD file).

>  {
>  	static struct lock_file msg_file;
>  
>  	int msg_fd = hold_lock_file_for_update(&msg_file, filename,
>  					       LOCK_DIE_ON_ERROR);
> -	if (write_in_full(msg_fd, msgbuf->buf, msgbuf->len) < 0)
> -		die_errno(_("Could not write to %s."), filename);
> +	if (write_in_full(msg_fd, msgbuf->buf, msgbuf->len) < 0) {
> +		int err = errno;
> +		strbuf_release(msgbuf);
> +		error(_("Could not write to %s: %s"), filename, strerror(err));
> +		return -err;
> +	}

Same question as before: why does the caller care about the nature of
the error?

hold_lock_file will die on error unless we declare we want to handle
that ourselves through its flags word.  If we are not about to exit,
we will also want to roll back the lockfile on errors, too.

>  	strbuf_release(msgbuf);
>  	if (commit_lock_file(&msg_file) < 0)
> -		die(_("Error wrapping up %s"), filename);
> +		return error(_("Error wrapping up %s"), filename);

If we use the "return -errno" convention, that would mean "return
-EPERM" on traditional Unixen.

[...]
> +static int error_dirty_worktree(const char *me)
> +{
> +	if (advice_commit_before_merge)
> +		return error(_("Your local changes would be overwritten by %s.\n"
> +				"Please, commit your changes or stash them to proceed."), me);
> +	return error(_("Your local changes would be overwritten by %s.\n"), me);
> +}

Side note: git still doesnât have a good API for dealing with advice.
I think Iâd prefer

	error(_("Your local changes would be overwritten by %s.\n"), me);
	if (advice_commit_before_merge)
		advise(_("Please, commit your...

or perhaps that last part would be

	advise(advice_commit_before_merge,
	       _("Please,...

which would write

 fatal: your local changes would be overwritten by the cherry-pick
 hint: please, commit your changes or stash them to proceed

> -static NORETURN void die_dirty_index(const char *me)
> -{
> -	if (read_cache_unmerged()) {
> -		die_resolve_conflict(me);
> -	} else {
> -		if (advice_commit_before_merge) {
> -			if (action == REVERT)
> -				die(_("Your local changes would be overwritten by revert.\n"
> -					  "Please, commit your changes or stash them to proceed."));
[...]
> +static int verify_resolution(const char *me)
> +{
> +	if (!read_cache_unmerged())
> +		return 0;
> +
> +	return error_resolve_conflict(me);
> +}
> +

How is this function meant to be used?  die_dirty_index would
presumably become something like

	if (verify_resolution(me))
		return -1;
	return error_dirty_worktree(me);

but that is not what your caller for this does.  Will say more when I
get to it.

[...]
> @@ -329,10 +341,12 @@ static int do_recursive_merge(struct commit *base, struct commit *next,
>  			    next_tree, base_tree, &result);
>  
>  	if (active_cache_changed &&
> -	    (write_cache(index_fd, active_cache, active_nr) ||
> -	     commit_locked_index(&index_lock)))
> +		(write_cache(index_fd, active_cache, active_nr) ||
> +			commit_locked_index(&index_lock))) {

Whitespace changes seem to have snuck in.

> +		rollback_lock_file(&index_lock);

Makes sense.

>  		/* TRANSLATORS: %s will be "revert" or "cherry-pick" */
> -		die(_("%s: Unable to write new index file"), me);
> +		return error(_("%s: Unable to write new index file"), me);
> +	}
>  	rollback_lock_file(&index_lock);

merge_recursive can die, too (e.g., "error building trees").

[...]
> @@ -397,19 +411,21 @@ static int do_pick_commit(void)

do_pick_commit is the main worker of "git cherry-pick foo..bar".  When
it returned an error, traditionally that meant it had encountered a
conflict, and the return value would what exit status to use to
indicate that (typically 1, one hopes, but perhaps 139 if the merge
strategy segfaulted, etc).

To recover, I suppose I can imagine a caller wanting to "try harder".
At any rate the CHERRY_PICK_HEAD, MERGE_HEAD, on-disk dirty index, and
so on are good state that should be kept.

>  		 * that represents the "current" state for merge-recursive
>  		 * to work on.
>  		 */
> -		if (write_cache_as_tree(head, 0, NULL))
> -			die (_("Your index file is unmerged."));
> +		if (write_cache_as_tree(head, 0, NULL)) {
> +			discard_cache();

I suppose this is cleaning up after the read_cache call in
write_cache_as_tree?  But the above does not back out the index lock
at the same time, so I donât find it so clear.  I would prefer to
see write_cache_as_tree taught to clean up after itself on error.

[...]
> @@ -418,19 +434,19 @@ static int do_pick_commit(void)
>  		struct commit_list *p;
>  
>  		if (!mainline)
> -			die(_("Commit %s is a merge but no -m option was given."),
> -			    sha1_to_hex(commit->object.sha1));
> +			return error(_("Commit %s is a merge but no -m option was given."),
> +				sha1_to_hex(commit->object.sha1));

This is not a conflict but an error in usage.  Stepping back for a
second, what is the best way to handle that?  The aforementioned
hypothetical caller considering "try an alternate strategy" would make
a wrong move, so we need to use a return value that would dissuade it.

Iâm not sure what the most appropriate thing to do is.  Maybe

	/*
	 * Positive numbers are exit status from conflicts; negative
	 * numbers are other errors.
	 */
	enum pick_commit_error {
		PICK_COMMIT_USAGE_ERROR = -1
	};

but itâs hard to think clearly about it since it seems too
hypothetical to me.  If all callers are going to exit, then

	error(...);
	return 129;

will work, but in that case why not exit for them?

>  
>  		for (cnt = 1, p = commit->parents;
>  		     cnt != mainline && p;
>  		     cnt++)
>  			p = p->next;
>  		if (cnt != mainline || !p)
> -			die(_("Commit %s does not have parent %d"),
> +			return error(_("Commit %s does not have parent %d"),
>  			    sha1_to_hex(commit->object.sha1), mainline);

Likewise.

>  		parent = p->item;
> -	} else if (0 < mainline)
> -		die(_("Mainline was specified but commit %s is not a merge."),
> +	} else if (mainline > 0)
> +		return error(_("Mainline was specified but commit %s is not a merge."),
>  		    sha1_to_hex(commit->object.sha1));

Likewise.

>  	else
>  		parent = commit->parents->item;
> @@ -441,11 +457,11 @@ static int do_pick_commit(void)
>  	if (parent && parse_commit(parent) < 0)
>  		/* TRANSLATORS: The first %s will be "revert" or
>  		   "cherry-pick", the second %s a SHA1 */
> -		die(_("%s: cannot parse parent commit %s"),
> +		return error(_("%s: cannot parse parent commit %s"),
>  		    me, sha1_to_hex(parent->object.sha1));
>  
>  	if (get_message(commit->buffer, &msg) != 0)
> -		die(_("Cannot get commit message for %s"),
> +		return error(_("Cannot get commit message for %s"),
>  				sha1_to_hex(commit->object.sha1));
>  
>  	/*
> @@ -484,7 +500,11 @@ static int do_pick_commit(void)
>  			strbuf_addstr(&msgbuf, ")\n");
>  		}
>  		if (!no_commit)
> -			write_cherry_pick_head();
> +			if ((res = write_cherry_pick_head())) {
> +				free_message(&msg);
> +				free(defmsg);
> +				return res;
> +			}

These need to eventually exit with status 128.

> @@ -524,44 +544,46 @@ static int do_pick_commit(void)
>  	return res;
>  }
>  
> -static void prepare_revs(struct rev_info *revs)
> +static int prepare_revs(struct rev_info *revs)

prepare_revs is used by the cherry-pick porcelain after parsing
arguments to initialize the revision walker.  If it fails, we can
save some trouble and cancel the whole operation.

>  {
> -	int argc;
> -
>  	init_revisions(revs, NULL);
>  	revs->no_walk = 1;
>  	if (action != REVERT)
>  		revs->reverse = 1;
>  
> -	argc = setup_revisions(commit_argc, commit_argv, revs, NULL);
> -	if (argc > 1)
> -		usage(*revert_or_cherry_pick_usage());
> +	if (setup_revisions(commit_argc, commit_argv, revs, NULL) > 1)
> +		return error(_("usage: %s"), *revert_or_cherry_pick_usage());

The exit status for errors in usage ought to be 129, and the message

	error: usage: ...

is not so pretty. :)

I think a library version of this would want to take a struct rev_info
that has been prepopulated.

[...]
> -static void read_and_refresh_cache(const char *me)
> +static int read_and_refresh_cache(const char *me)
>  {
>  	static struct lock_file index_lock;
>  	int index_fd = hold_locked_index(&index_lock, 0);
>  	if (read_index_preload(&the_index, NULL) < 0)
> -		die(_("git %s: failed to read the index"), me);
> +		return error(_("%s: failed to read the index"), me);

What is the caller expecting to happen when this fails?  Do we
want to roll back the lockfile and discard the index?

>  	refresh_index(&the_index, REFRESH_QUIET|REFRESH_UNMERGED, NULL, NULL, NULL);
>  	if (the_index.cache_changed) {
>  		if (write_index(&the_index, index_fd) ||
> -		    commit_locked_index(&index_lock))
> +			commit_locked_index(&index_lock)) {

Whitespace change snuck in.

> -			die(_("git %s: failed to refresh the index"), me);
> +			rollback_lock_file(&index_lock);
> +			return error(_("%s: failed to refresh the index"), me);

Likewise.  (Should this discard the index?  If so, why?  If not, why
not?)

> +		}
>  	}
>  	rollback_lock_file(&index_lock);
> +	return 0;
>  }
>  
>  static int revert_or_cherry_pick(int argc, const char **argv)

The porcelain itself, but the sequencer wants to call it.

>  {
>  	struct rev_info revs;
> +	int res;
>  
>  	git_config(git_default_config, NULL);
>  	me = action == REVERT ? "revert" : "cherry-pick";
> @@ -579,17 +601,15 @@ static int revert_or_cherry_pick(int argc, const char **argv)
>  			die(_("cherry-pick --ff cannot be used with --edit"));
>  	}
>  
> -	read_and_refresh_cache(me);
> -	prepare_revs(&revs);
> -
> +	if ((res = read_and_refresh_cache(me)) ||
> +		(res = prepare_revs(&revs)))
> +		return res;

Clearer to write:

	if (read_and_refresh_cache(me) ||
	    prepare_revs(&revs))
		return -1;

and it has the nice side-effect of making it clear to callers that
they donât have to worry about return value conventions from those
functions.

> -	while ((commit = get_revision(&revs))) {
> -		int res = do_pick_commit();
> -		if (res)
> -			return res;
> -	}
> -
> -	return 0;
> +	while ((commit = get_revision(&revs)) &&
> +		!(res = do_pick_commit()))
> +		;
> +
> +	return res;

I think the original is clearer here.

>  }
>
[...]
>  How do I trap and handle the exit status from write_message in
>  do_pick_commit correctly?  There's already a call to
>  do_recursive_merge whose exit status is being trapped -- what happens
>  when do_recursive_merge succeeds and write_message fails, or
>  viceversa?

Good question.  See my confused "enum" ramble above.

>  Junio has suggested dropping error_errno, and simply using error and
>  returning -errno by hand in one email.  Considering the number of
>  times I've used that tecnique, and I think we should get something
>  like an error_errno atleast for the sake of terseness.

I do think an error_errno that unconditionally returns -1 (analagous
to die_errno) would be a nice thing to have, even though it wouldnât
make anything much shorter. :)

*whew*

Thanks and hope that helps,
Jonathan
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html