On Wed, Apr 1, 2020 at 4:29 PM Ian Jackson
<ijackson@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>
> Hi. Thanks for looking at this.
>
> Elijah Newren via GitGitGadget writes ("[PATCH] sequencer: honor GIT_REFLOG_ACTION"):
> > I'm not the best with getenv/setenv. The xstrdup() wrapping is
> > apparently necessary on mac and bsd. The xstrdup seems like it leaves us
> > with a memory leak, but since setenv(3) says to not alter or free it, I
> > think it's right. Anyone have any alternative suggestions?
>
> I can try to help. It's not entirely trivial.
>
> The setenv interface is a wrapper around putenv. putenv has had a
> variety of different semantics. Some of these sets of semantics
> cannot be used to re-set the same environment variable without a
> memory leak - and even figuring out what semantics you have would be
> complex and tend to produce code which would fail in bad ways.
> There's a short summary of the situation in Linux's putenv(3).
>
> Would it be possible for git to arrange to set GIT_REFLOG_ACTION only
> when it is invoking subprocesses ? Otherwise it would update, and
> look at, a global variable of its own. (Or a parameter to relevant
> functions if one doesn't like the action-at-a-distance effect of a
> global.)
>
> And, it seems to me that the reflog handling should be centralised.
>
> > +	char *reflog_action = getenv("GIT_REFLOG_ACTION");
> >
> >  	va_start(ap, fmt);
> >  	strbuf_reset(&buf);
> > -	strbuf_addstr(&buf, action_name(opts));
> > +	strbuf_addstr(&buf, reflog_action ? reflog_action : action_name(opts));
>
> Open coding this kind of thing at every site which needs to think
> about the reflog actions will surely result in some of the instances
> having bugs.
>
> Writing a single function that contains this (or most of it) would
> happily decouple all of its call sites from literally asking about
> getenv("GIT_REFLOG_ACTION") thereby making it easier to do the
> indirection-through-program-variables I suggest.

That sounds great, but I'm not sure that "only when invoking
subprocesses" will limit the places where we set the environment
variable all that much; it might actually expand it. I wasn't there
for the whole history, but my understanding is that the rebase code
has slowly transformed: from the original all-shell rebase
implementation(s), to a helper program that the shell could call into
for parts of its operations (passing control back and forth between
shell and C), to a reimplementation that just invoked the same
commands the shell script would have, and now, slowly, into an actual
library where invocations of other git subprocesses are being replaced
with the relevant function calls. It's a long cleanup process that is
still ongoing. I'd like to get to the point where we only invoke
subprocesses if the user specifies --exec or a special merge strategy,
but that's a goal with a longer-term timeframe than fixing a 2.26
regression.

> Having said that,
>
> > diff --git a/t/t3406-rebase-message.sh b/t/t3406-rebase-message.sh
> > index 61b76f33019..927a4f4a4e4 100755
> > --- a/t/t3406-rebase-message.sh
> > +++ b/t/t3406-rebase-message.sh
>
> This test case convinces me that the patch has the right behaviour for
> at least the case I care about :-).

Cool, sounds like it's a good immediate fix for the 2.26 regression,
and then longer term, as we continue refactoring, we can hopefully
isolate subprocess handling and writing of state.

As a heads up, though, my personal plans for rebase (subject to buy-in
from other stakeholders) are to make it do a lot more in-memory work.
In particular, this means that for common cases there will be no
subprocess invocations, no writing of any state unless/until you hit a
conflict, no updating of any files in the working tree until all
commits have been created (or a conflict is hit), and no updating of
the branch until after all the commits have been created.
Thus, for the common cases with no conflicts, there would be only one
entry in HEAD's reflog for the entire operation, rather than
approximately one per commit. I have a proof-of-concept showing these
ideas work for basic cases. So, I hope your tests don't depend on the
number of entries added to HEAD's reflog.
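
For concreteness, here is a rough sketch of the kind of centralised
helper suggested in the quoted review: read GIT_REFLOG_ACTION once into
a program variable, have every call site ask one function for the
action name, and touch the environment only when a child process is
about to be run. This is not code from git. The names
reflog_action_init(), reflog_action_get(), format_reflog_message() and
reflog_action_export_env() are hypothetical, plain strdup() stands in
for git's xstrdup(), the "<action>: <detail>" format is a
simplification, and exporting the effective action (rather than only an
explicit override) to children is an assumption.

```c
#define _POSIX_C_SOURCE 200809L	/* for strdup() and setenv() */

#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical program variable; NULL means "no override was given". */
static char *reflog_action_override;

/*
 * Take a private copy of GIT_REFLOG_ACTION.  Copying avoids holding a
 * pointer into the environment, which a later setenv() may invalidate.
 * If strdup() fails we silently fall back to "no override"; real code
 * would want to die instead.
 */
void reflog_action_init(void)
{
	const char *env = getenv("GIT_REFLOG_ACTION");

	free(reflog_action_override);
	reflog_action_override = env ? strdup(env) : NULL;
}

/* The one place that decides which action name goes into the reflog. */
const char *reflog_action_get(const char *fallback)
{
	assert(fallback);
	return reflog_action_override ? reflog_action_override : fallback;
}

/* Simplified message builder: writes "<action>: <detail>" into buf. */
int format_reflog_message(char *buf, size_t len,
			  const char *fallback, const char *detail)
{
	return snprintf(buf, len, "%s: %s",
			reflog_action_get(fallback), detail);
}

/*
 * Meant to be called only on the path that is about to run a child
 * process, so that in-process code never has to re-set the variable.
 * Returns 0 on success, -1 on error (as setenv() does).
 */
int reflog_action_export_env(const char *fallback)
{
	return setenv("GIT_REFLOG_ACTION", reflog_action_get(fallback), 1);
}
```

With something like this, a call site would call reflog_action_init()
once at startup and then use format_reflog_message(buf, sizeof(buf),
"rebase", "pick ...") wherever a message is needed, instead of asking
getenv("GIT_REFLOG_ACTION") directly.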