Re: [PATCH 11/14] replay: use standard revision ranges

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Apr 14, 2023 at 7:23 AM Derrick Stolee <derrickstolee@xxxxxxxxxx> wrote:
>
> On 4/14/2023 10:09 AM, Derrick Stolee wrote:
> > On 4/7/2023 3:24 AM, Christian Couder wrote:
> >> From: Elijah Newren <newren@xxxxxxxxx>
> >>
> >> Instead of the fixed "<oldbase> <branch>" arguments, the replay
> >> command now accepts "<revision-range>..." arguments in a similar
> >> way as many other Git commands. This makes its interface more
> >> standard and more flexible.
> >
> > Unfortunately, while doing this, you have broken the --onto
> > logic:
> >
> >  $ git replay --onto HEAD~2 HEAD~1 HEAD
> >  fatal: replaying down to root commit is not supported yet!
> >
> > The rev-walk you are supplying by this line...
> >
> >> +    argc = setup_revisions(argc, argv, &revs, NULL);
> >
> > is taking the remaining arguments and using them as tips to
> > walk. We need to be able to recognize that --onto A B C means
> > that A is the new base and our walk is B..C.
>
> I'm realizing after hitting "send" that this change is
> intentional (based on your test updates). I don't agree with
> it, though.
>
> Sending arbitrary command-line arguments to setup_revisions()
> creates an opportunity for behavior you are not expecting.

Note that fast-export has always accepted the full range of
setup_revisions() flags, even though many options make no sense.
Perhaps we could tighten up the parsing for fast-export, but limiting
fast-export to three commits in order to avoid other nonsensical
combinations would destroy its utility.

You may think I've gone on a tangent, but I think it's an important
comparison point here.  I think limiting replay to three commits does
the same: destroys its intended utility.  replay is a patch-based
analogue to fast-export, and we need a wide range of the revision
ranges for it to work.

(There are other interesting comparisons here too.  For example, if
someone specifies commits rather than branches to fast-export, then
things break down: new commits may be written but nothing takes
advantage of them so they merely become garbage to be gc'ed later.
Likely not what the user wanted.  The exact same issue existed for
replay.  However, in replay's case I tried to add some sanity checking
for it -- perhaps that sanity checking would be useful to copy to
fast-export?)

> For instance, can users pass multiple ranges?

There's no such thing as multiple ranges for most commands (see
f302c1e4aa0 ("revisions(7): clarify that most commands take a single
revision range", 2021-05-18))

However, users should absolutely be allowed to specify something like

  $ git replay --onto new-base master..my-topic some-base..other-topic

which is one revision range, and I'd expect replay to take commits in
the history of either my-topic or other-topic, which are not in the
history of either master or some-base, and replay them.  It'd be
equivalent, in most cases, to the separate commands:

   $ git replay --onto new-base ^master ^some-base my-topic
   $ git replay --onto new-base ^master ^some-base other-topic

The first of the two commands above, for example, would replay the
commits in my-topic back to nearer of (master, some-base) and do the
replaying on top of new-base.  The second command is similar, but with
my-topic replaced by other-topic.

Notice, though, that I said most cases.  There's an important case
where the single command is different from the two commands: when
my-topic and other-topic share common commits (which is not also
shared by master and some-base), we want the replayed common history
to remain common.  If you do two separate commands, the replayed
common history is no longer common (those commits will instead be
duplicated with separate committer dates).  So, supporting the single
command form is an intrinsic design consideration for git-replay.

> Can users supply --first-parent?

Absolutely, and it might be a useful replacement for or supplement to
--[no-]rebase-cousins.

Also, we definitely want to allow --ancestry-path and
--ancestry-path=<commit>, the latter of which was specifically written
for use in git-replay.

In fact, --first-parent sometimes might be equivalent to a particular
specialization of --ancestry-path=<commit>, so yeah, I definitely want
to allow it.

> What happens if they add an --author filter?

git fast-export accepts full revision ranges, including ones that make
no sense like this.  We could add patches to exclude such options, to
both replay and fast-export.  Or just document that they make no
sense.

To answer the question as asked, though, let me first provide a little
background: for a commit A with parent B, git-replay checks whether
the parent of the current commit (i.e. for A that'd be B) has been
replayed (as B').  If so, it will make the parent of A' be B'.  If
there is no B' (e.g. because B was part of "upstream" and thus wasn't
replayed), it makes the parent of A' be whatever is passed to --onto.
This allows git-replay to not only rebase/replay a linear set of
patches, but a partially or even fully divergent set of branches.

Now, with that background, if someone passes --author, there will be a
lot of parent commits that aren't replayed, resulting in each new
"first" commit being rebased on top of --onto, and only the final
one(s) being referenced by the updated ref commands.  In other words,
it'll leave lots of orphaned commits behind and only keep the "newest"
contiguous ones belonging to the author.  In simpler terms, passing
--author to git-replay would "make a really big mess of things" from
the user perspective, but an implementationally well-defined mess.
This, again, is similar to passing --author to fast-export -- that
tool has very different primitives involved causing it to make a
completely different big mess of things (and perhaps even bigger mess
of things), but the mess is at least implementationally well-defined
to those that understand the data structures and implementation.  (And
yes, I thought of this exact case while developing the tool, for what
it's worth.)

Does that mean I want to allow people to pass --author to fast-export
or replay?  I'd be perfectly happy with patches that prohibit it in
both tools.  But I'm also not worried about people off-roading and
trying it and getting non-sense.  A documentation patch stating that
using options that "disconnect" history make no sense with either
replay or fast-export might be the easiest (and most future-proof)
solution.

> [unstated: and what about other similar options?]

I'd really rather not have an allowlist of which revision options
should be allowed for use by git-replay.  A denylist might be okay, if
also implemented for fast-export, but I'm more of the opinion that we
just document that both commands only make sense with "contiguous"
history.

Most importantly, though, at a minimum I do absolutely want to allow
several negative revisions to be specified, several positive revisions
to be specified, and special flags like --ancestry-path or
--first-parent to be specified.  There may well be additional flags,
both current and future, that should be allowed too.

In short, I don't want another limited rebase; I want a more generic tool.


Hope that helps,
Elijah




[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux