Re: [PATCH 02/31] rebase: refactor reading of state

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, 28 Dec 2010, Junio C Hamano wrote:

> Martin von Zweigbergk <martin.von.zweigbergk@xxxxxxxxx> writes:
> 
> > +read_state () {
> > +	if test -d "$merge_dir"
> > +	then
> > +		state_dir="$merge_dir"
> > +		prev_head=$(cat "$merge_dir"/prev_head) &&
> > +		end=$(cat "$merge_dir"/end) &&
> > +		msgnum=$(cat "$merge_dir"/msgnum)
> > +	else
> > +		state_dir="$apply_dir"
> > +	fi &&
> > +	head_name=$(cat "$state_dir"/head-name) &&
> > +	onto=$(cat "$state_dir"/onto) &&
> > +	orig_head=$(cat "$state_dir"/orig-head) &&
> > +	GIT_QUIET=$(cat "$state_dir"/quiet)
> > +}
> > +
> >  continue_merge () {
> >  	test -n "$prev_head" || die "prev_head must be defined"
> >  	test -d "$merge_dir" || die "$merge_dir directory does not exist"
> > @@ -138,10 +154,6 @@ call_merge () {
> >  }
> >  
> >  move_to_original_branch () {
> > -	test -z "$head_name" &&
> > -		head_name="$(cat "$merge_dir"/head-name)" &&
> > -		onto="$(cat "$merge_dir"/onto)" &&
> > -		orig_head="$(cat "$merge_dir"/orig-head)"
> 
> It used to be safe to call this without head_name and friends set, because
> the function took care of reading these itself.  Nobody calls this without
> head_name set anymore?
> 
> I am not complaining nor suggesting to add an unnecessary "read_state"
> here only to slow things down---I am making sure that you removed this
> because you know it is unnecessary.

Correct. It used to be called without head_name set from
finish_rb_merge, which would in turn be called from the --continue and
--skip arms. onto would already have been set in these cases, but
would then be re-read.

> > @@ -220,13 +232,9 @@ do
> >  			echo "mark them as resolved using git add"
> >  			exit 1
> >  		}
> > +		read_state
> >  		if test -d "$merge_dir"
> >  		then
> > -			prev_head=$(cat "$merge_dir/prev_head")
> > -			end=$(cat "$merge_dir/end")
> > -			msgnum=$(cat "$merge_dir/msgnum")
> > -			onto=$(cat "$merge_dir/onto")
> > -			GIT_QUIET=$(cat "$merge_dir/quiet")
> 
> Even though this patch may make the code shorter, it starts to read
> head_name and orig_head that we previously did not open and change the
> values of variables with what are read from them.  Does this change affect
> the behaviour in any way (either in performance or in correctness)?

True. Previously, head_name and orig_head were lazily read only when
the rebase had been finished (in the finish_rb_merge, as I mentioned
above). Now these values are read unnecessarily when the rebase is
resumed, but a later patch fails. The am-based rebase is not affacted
by this patch as it already read head_name and orig_head eagerly.

Good point about the correctness. I looked into the correctness issue
arising if a file could not be found and I think I concluded that all
of the files would always be written (do we need to care about the
case where a user deletes e.g. .git/rebase-apply/onto?). However, I
did not think about the possibility of overwriting variables. It seems
fine, though, as neither continue_merge nor call_merge uses either of
these variables.

Regarding performance, I would say there is definitely a cost
associated with this patch. How big this cost is, though, I don't dare
to speculate. I will leave that up to the rest of you.

It should be noted that read_state is only called when a rebase is
resumed with --continue or --skip, or when it is aborted. There are no
changes to the code in the call_merge-continue_merge loop.

The performance hit is probably biggest in the --abort case, which
previously only read head_name and orig_head. It now ends up reading
_at least_ two more values.

> 
> > @@ -236,10 +244,6 @@ do
> >  			finish_rb_merge
> >  			exit
> >  		fi
> > -		head_name=$(cat "$apply_dir"/head-name) &&
> > -		onto=$(cat "$apply_dir"/onto) &&
> > -		orig_head=$(cat "$apply_dir"/orig-head) &&
> > -		GIT_QUIET=$(cat "$apply_dir"/quiet)
> >  		git am --resolved --3way --resolvemsg="$RESOLVEMSG" &&
> >  		move_to_original_branch
> 
> Earlier move-to-original-branch was Ok to be called without head_name, and
> the old code here read from the file anyway, so it didn't matter, but now
> it seems that the first check and assignment you removed from the function
> may matter because this caller does not even read from head_name.  Are you
> sure about this change?

If I understand your question correctly, then yes, it is ok, because
of the previous hunk that calls read_state. That call is made
before/outside the if block for merge-based rebase, so the variables
are already set when this code is reached.
 
> > @@ -279,18 +275,15 @@ do
> >  			die "No rebase in progress?"
> >  
> >  		git rerere clear
> > -
> > -		test -d "$merge_dir" || merge_dir="$apply_dir"
> 
> My heartbeat skipped when I first saw this.  Thanks to the previous
> commit, it was exposed that somebody reused $dotest that was only to be
> used when using merge machinery because the things left to be done in this
> codepath are common between the merge and apply, which is kind of sloppy,
> but that sloppiness is now gone ;-).
> 
> Are there places that read from individual files for states left after
> this patch, or read_state is the only interface to get to the states?  If
> the latter that would be a great news, and also would suggest that we may
> want to have a corresponding write_state function (and we may even want to
> make the state into a single file to reduce I/O---but that is a separate
> issue that can be done at the very end of the series if it turns out to be
> beneficial).

There are still a few other places where state is read, mainly in
call_merge. It reads things that vary from iteration to iteration,
such as a counter. I forgot to say in the commit message, but I tried
to extract only the code that reads the initial state.

The write_state function actually is there, but it comes pretty late,
in patch 24. I don't remember why I added it so much later. I could
possibly move it closer to the beginning of this series.
 
> Of course it is fine if introduction of read_state is an attempt to catch
> most common cases, but it would reduce chances of mistake if the coverage
> were 100% (as opposed to 99.9%) hence this question.

Do you mean if all the state was read in the read_state function? I
should say that the pieces of state that are read in read_state are
not read anywhere else. But overall, the coverage is more like 60% or
so.


Thanks for a thorough review. Many of these things should have been in
the commit message. I need to get better at writing those...


Thanks,
Martin

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]