Re: [PATCH v3 3/3] fast-export, fast-import: implement signed-commits

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Apr 29, 2021 at 4:42 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Elijah Newren <newren@xxxxxxxxx> writes:
>
> > I do get that we might have to use warn-strip as the default anyway
> > just because some existing tools might rely on it, but do you have any
> > examples outside of git-filter-repo?  Given the filter-repo bug
> > reports I've gotten with users being surprised at commit signatures
> > being stripped (despite the fact that this is documented -- users
> > don't always read the documentation), I'd argue that changing to
> > --signed-commits=abort as the default is probably a good bugfix for
> > both fast-export and for filter-repo.
>
> Thanks.  The "filter-repo already gets bug reports from the users"
> is a valuable input when deciding if it is reasonable to sell the
> behaviour change as a bugfix to our users.
>
> Perhaps teaching fast-export to pay attention to two environment
> variables that say "when no --signed-{tag,commit}=<disposition>"
> command line option is given, use this behaviour" would be a good
> enough escape hatch for existing tools and their users, while they
> are waiting for their tools to get updated with the new option you
> are planning to add?

As far as git filter-repo is concerned, you can immediately introduce
--signed-commit and give it a default value of abort with no
deprecation period.  filter-repo already has to check git versions for
available command line options, so one more wouldn't hurt.  And a
default of "abort" seems more user friendly, as it gives users a
chance to be aware of and handle their data appropriately.

Of course, there are a few factors that make filter-repo more tolerant
of upstream changes: I don't expect people to user filter-repo often
(it's a once-in-a-blue-moon rewrite), I don't expect them to use it in
automated processes, I tend to make releases that coincide in timing
with git releases (so I'll just release a git-filter-repo 2.32.0 the
day you release git 2.32, and it'll come with an option to handle this
new default), and filter-repo includes the following disclaimer in its
documentation:

"""
I assume that people use filter-repo for one-shot conversions, not
ongoing data transfers. I explicitly reserve the right to change any
API in filter-repo based on this presumption (and a comment to this
effect is found in multiple places in the code and examples). You have
been warned.
"""

So, if it's just for filter-repo, then I'd say just change
fast-export's default now.  If you're concerned with
--signed-commit=abort being a changed default being too drastic for
other users or tools, then the environment variable escape hatch
sounds reasonable to me.

Personally, I'm worried users are seeing "lost" data (though they
don't notice it until weeks or months later) and are being surprised
by it, which feels like a bigger issue to me than "my automated script
isn't running anymore on this one repo, now I have to figure out what
flag to use in order to choose whether I care about that data from
that special repo being tossed or not".  So I would bias towards
throwing an error so users get a chance to handle it.

> Also, I am glad that you brought up another possible behaviour that
> Luke's patch did not add.  Exporting existing signatures that may
> become invalid and deciding what to do with them on the receiving
> end would be a good option to have.  And that would most likely have
> to be done at "fast-import" end, as a commit that "fast-export"
> expected to retain its object name if its export stream were applied
> as-is may not retain the object name when the export stream gets
> preprocessed before being fed to "fast-import".

Right, but I'd go a step further: Even if the fast-export stream is
not pre-processed before feeding to fast-import, you still cannot
always expect to get the same object names when importing the stream.

To see why, note that the fast-export stream has no way to encode tree
information.  So if trees in the original history deviated from
"normal" in some fashion, such as not-quite-sorted entries, or non
standard modes, then sending those objects through fast-export and
fast-import will necessarily result in different object names.
fast-export also may have modified other objects to normalize them,
either because of default re-encoding of commit messages into UTF-8,
because of stripping any unrecognized commit headers, or perhaps even
because it'd truncate commit messages with an embedded NUL character.

Combine all these "normalizations" that fast-export/fast-import do
with the ability for users to process the stream from fast-export to
fast-import and it becomes clear that the only stage in the pipeline
that can check the validity of the gpg signatures for the imported
history is the fast-import step.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux