Re: [PATCH] sequencer: honor GIT_REFLOG_ACTION

On Wed, Apr 1, 2020 at 4:29 PM Ian Jackson
<ijackson@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi.  Thanks for looking at this.
>
> Elijah Newren via GitGitGadget writes ("[PATCH] sequencer: honor GIT_REFLOG_ACTION"):
> >     I'm not the best with getenv/setenv. The xstrdup() wrapping is
> >     apparently necessary on mac and bsd. The xstrdup seems like it leaves us
> >     with a memory leak, but since setenv(3) says to not alter or free it, I
> >     think it's right. Anyone have any alternative suggestions?
>
> I can try to help.  It's not entirely trivial.
>
> The setenv interface is a wrapper around putenv.  putenv has had a
> variety of different semantics.  Some of these sets of semantics
> cannot be used to re-set the same environment variable without a
> memory leak - and even figuring out what semantics you have would be
> complex and tend to produce code which would fail in bad ways.
> There's a short summary of the situation in Linux's putenv(3).
>
> Would it be possible for git to arrange to set GIT_REFLOG_ACTION only
> when it is invoking subprocesses ?  Otherwise it would update, and
> look at, a global variable of its own.  (Or a parameter to relevant
> functions if one doesn't like the action-at-a-distance effect of a
> global.)
>
> And, it seems to me that the reflog handling should be centralised.
>
> > +     char *reflog_action = getenv("GIT_REFLOG_ACTION");
> >
> >       va_start(ap, fmt);
> >       strbuf_reset(&buf);
> > -     strbuf_addstr(&buf, action_name(opts));
> > +     strbuf_addstr(&buf, reflog_action ? reflog_action : action_name(opts));
>
> Open coding this kind of thing at every site which needs to think
> about the reflog actions will surely result in some of the instances
> having bugs.
>
> Writing a single function that contains this (or most of it) would
> happily decouple all of its call sites from literally asking about
> getenv("GIT_REFLOG_ACTION") thereby making it easier to do the
> indirection-through-program-variables I suggest.

That sounds great, but I'm not sure that "only when invoking
subprocesses" will limit the places where we set the environment
variable all that much; it might actually expand it.  I wasn't there
for the whole history, but my understanding is that the rebase code
has evolved in stages: it started as the original all-shell rebase
implementation(s); then became a helper program that the shell could
call into for parts of its operations, passing control back and forth
between shell and C; then a reimplementation that just invoked the
same commands the shell script would have; and is now slowly turning
into an actual library where invocations of other git subprocesses are
being replaced with the relevant function calls.  It's a long cleanup
process that is still ongoing.  I'd like to get to the point where we
only invoke subprocesses if the user specifies --exec or a special
merge strategy, but that's a goal with a longer-term timeframe than
fixing a 2.26 regression.

> Having said that,
>
> > diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
> > index 61b76f33019..927a4f4a4e4 100755
> > --- a/t/t3406-rebase-message.sh
> > +++ b/t/t3406-rebase-message.sh
>
> This test case convinces me that the patch has the right behaviour for
> at least the case I care about :-).

Cool, sounds like it's a good immediate fix for the 2.26 regression,
and then longer term as we continue refactoring we can hopefully
isolate subprocess handling and writing of state.
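
To make that longer-term direction a bit more concrete, here is a
rough, standalone sketch of the kind of centralised helper Ian
describes: the action lives in a program variable, callers ask one
function for it, and the environment variable is only exported in the
child when a subprocess is spawned.  All of the function names below
are made up for illustration; none of this is existing git code, and a
real version would presumably use git's own run-command machinery
rather than raw fork/exec.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical process-wide override; NULL means "use the caller's default". */
static char *reflog_action_override;

/* Replace the override with a private copy of action (NULL clears it). */
void reflog_action_set(const char *action)
{
	char *copy = action ? strdup(action) : NULL;

	free(reflog_action_override);
	reflog_action_override = copy;
}

/* Read GIT_REFLOG_ACTION once, e.g. at startup, into the program variable. */
void reflog_action_init(void)
{
	reflog_action_set(getenv("GIT_REFLOG_ACTION"));
}

/* The single place call sites ask; none of them call getenv() themselves. */
const char *reflog_action_get(const char *fallback)
{
	assert(fallback != NULL);
	return reflog_action_override ? reflog_action_override : fallback;
}

/*
 * Run argv in a child process, exporting the action only there.  The
 * parent's environment is never modified, so the parent never has to
 * re-set the variable.  Returns the child's exit status, or -1 on error.
 */
int run_with_reflog_action(char *const argv[], const char *fallback)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0) {
		/* Child: whatever setenv() allocates goes away at exec/exit. */
		if (setenv("GIT_REFLOG_ACTION", reflog_action_get(fallback), 1))
			_exit(127);
		execvp(argv[0], argv);
		_exit(127);
	}
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

With something like this, the line in the patch would become
strbuf_addstr(&buf, reflog_action_get(action_name(opts))), and the
setenv() leak question only ever arises in a short-lived child.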

As a heads up, though, my personal plan for rebase (subject to buy-in
from other stakeholders) is to make it do a lot more in-memory work.
In particular, this means for common cases there will be no subprocess
invocations, no writing of any state unless/until you hit a conflict,
no updating of any files in the working tree until all commits have
been created (or a conflict is hit), and no updating of the branch
until after all the commits have been created.  Thus, for the common
cases with no conflicts, there would only be 1 entry in the reflog of
HEAD for the entire operation, rather than approximately 1 per commit.  I
have a proof-of-concept showing these ideas work for basic cases.  So,
I hope your tests don't depend on the number of entries added to
HEAD's reflog.


