Re: [PATCH v2] fast-export: fix surprising behavior with --first-parent

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Elijah Newren <newren@xxxxxxxxx> writes:

> ...  Here's what I think are the relevant points
> (and yeah, it's lengthy):
>
>
> The revision traversal machinery typically processes and returns all
> children before any parent.  fast-export needs to operate in the
> reverse fashion, handling parents before any of their children in
> order to build up the history starting from the root commit(s).  This
> would be a clear case where we could just use the revision traversal
> machinery's "reverse" option to achieve this desired affect.
>
> However, this wasn't what the code did.  It added its own array for
> queuing.  The obvious hand-rolled solution would be to just push all
> the commits into the array and then traverse afterwards, but it didn't
> quite do that either.  It instead attempted to process anything it
> could as soon as it could, and once it could, check whether it could
> process anything that had been queued.  As far as I can tell, this was
> an effort to save a little memory in the case of multiple root commits
> since it could process some commits before queueing all of them.  This
> involved some helper functions named has_unshown_parent() and
> handle_tail().  For typical invocations of fast-export, this
> alternative essentially amounted to a hand-rolled method of reversing
> the commits -- it was a bunch of work to duplicate the revision
> traversal machinery's "reverse" option.
>
> This hand-rolled reversing mechanism is actually somewhat difficult to
> reason about.  It takes some time to figure out how it ensures in
> normal cases that it will actually process all traversed commits
> (rather than just dropping some and not printing anything for them).
>
> And it turns out there are some cases where the code does drop commits
> without handling them, and not even printing an error or warning for
> the user.  Due to the has_unshown_parent() checks, some commits could
> be left in the array at the end of the "while...get_revision()" loop
> which would be unprocessed.  This could be triggered for example with
>     git fast-export main -- --first-parent
> or non-sensical traversal rules such as
>     git fast-export main -- --grep=Merge --invert-grep
>
> While most traversals that don't include all parents should likely
> trigger errors in fast-export (or at least require being used in
> combination with --reference-excluded-parents), the --first-parent
> traversal is at least reasonable and it'd be nice if it didn't just
> drop commits.  It'd also be nice to have a simpler "reverse traversal"
> mechanism.  Use the "reverse" option of the revision traversal
> machinery to achieve both.

The above is a very helpful and understandable explanation of what
is going on.  I am a bit puzzled by the very last part, though. By
"It'd also be nice to have a simpler 'reverse traversal' mechanism",
do you mean that the end users have need to control the direction
the traversal goes (in other words, they use "git fast-export" for
some thing, and "git fast-export --reverse" to achieve some other
things)?  Or do you just mean that we need to do a reverse traversal
but that is already available in the revision traversal machinery,
and not using it and rolling our own does not make sense?

> Even for the non-sensical traversal flags like the --grep one above,
> this would be an improvement.  For example, in that case, the code
> previously would have silently truncated history to only those commits
> that do not have an ancestor containing "Merge" in their commit
> message.  After this code change, that case would would include all

"would would" -> "would"

> commits without "Merge" in their commit message -- but any commit that
> previously had a "Merge"-mentioning parent would lose that parent
> (likely resulting in many new root commits).  While the new behavior
> is still odd, it is at least understandable given that
> --reference-excluded-parents is not the default.

Nicely written.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux