Re: Round-tripping fast-export/import changes commit hashes

On Mon, Mar 1, 2021 at 12:04 PM Ævar Arnfjörð Bjarmason
<avarab@xxxxxxxxx> wrote:
>
>
> On Mon, Mar 01 2021, Elijah Newren wrote:
>
> > On Sun, Feb 28, 2021 at 11:44 PM anatoly techtonik <techtonik@xxxxxxxxx> wrote:
> >>
> >> On Sun, Feb 28, 2021 at 1:34 PM Ævar Arnfjörð Bjarmason
> >> <avarab@xxxxxxxxx> wrote:
> >> >
> >> > I think Elijah means that in the general case people are using fast
> >> > export/import to export/import between different systems or in
> >> > combination with a utility like git-filter-repo.
> >> >
> >> > In those cases users are also changing the content of the repository, so
> >> > the hashes will change, invalidating signatures.
> >> >
> >> > But there are also cases where e.g. you don't modify the history, or only
> >> > part of it, and could then preserve these headers. I think there's no
> >> > inherent reason not to do so, just that nobody's cared enough to submit
> >> > patches etc.
> >>
> >> Is fast-export/import the only way to filter information in `git`? Maybe there
> >> is a slow json-export/import tool that gives a complete representation of all
> >> events in a repository? Or API that can be used to serialize and import that
> >> stream?
> >>
> >> If not, then I'd like to take a look at where header filtering and serialization
> >> take place. My C skills are at the "hello world" level, so I am not sure I can
> >> write a patch. But I can write the logic in Python and ask somebody to port
> >> that.
> >
> > If you are intent on keeping signatures because you know they are
> > still valid, then you already know you aren't modifying any
> > blobs/trees/commits leading up to those signatures.  If that is the
> > case, perhaps you should just avoid exporting the signature or
> > anything it depends on, and just export the stuff after that point.
> > You can do this with fast-export's --reference-excluded-parents option
> > and pass it an exclusion range.  For example:
> >
> >    git fast-export --reference-excluded-parents ^master~5 --all
> >
> > and then pipe that through fast-import.
> >
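
A rough sketch of that pipeline in Python, since Python came up earlier in
the thread. The function names and the two repository paths are made up
for illustration; only the git commands and options come from the
suggestion above. Note that the target repository must already contain the
excluded commits, because the stream refers to them by hash.

```python
import subprocess

def export_cmd(exclude_rev):
    """Build the fast-export command quoted above; exclude_rev is e.g. 'master~5'."""
    return ["git", "fast-export", "--reference-excluded-parents",
            "^" + exclude_rev, "--all"]

def roundtrip(source_repo, target_repo, exclude_rev):
    """Pipe fast-export from source_repo into fast-import in target_repo.

    Both paths are hypothetical; target_repo is assumed to already hold
    the history that exclude_rev cuts off.
    """
    exporter = subprocess.Popen(export_cmd(exclude_rev), cwd=source_repo,
                                stdout=subprocess.PIPE)
    importer = subprocess.run(["git", "fast-import"], cwd=target_repo,
                              stdin=exporter.stdout)
    exporter.stdout.close()
    # Return both exit codes so the caller can check each side.
    return exporter.wait(), importer.returncode
```

Any filtering step would sit between the two processes; objects that the
stream never re-creates keep their original hashes.
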
> >
> > In general, I think if fast-export or fast-import are lacking features
> > you want, we should add them there, but I don't see how adding
> > signature reading to fast-import and signature exporting to
> > fast-export makes sense in general.  Even if you assume fast-import
> > can process all the bits it is sent (e.g. you extend it to support
> > commits without an author, tags without a tagger, signed objects, any
> > other extended commit headers), and even if you add flags to
> > fast-export to die if there are any bits it doesn't recognize and to
> > export all pieces of blobs/trees/tags (e.g. don't add missing authors,
> > don't re-encode messages in UTF-8, don't use grafts or replace
> > objects, keep extended headers such as signatures, etc.), then it
> > still couldn't possibly work in all cases in general.  For example, if
> > you had a repository with unusual objects made by ancient or broken
> > git versions (such as tree entries in the wrong sort order, or tree
> > entries that recorded modes of 040000 instead of 40000 for trees or
> > something with perms other than 100644 or 100755 for files), then when
> > fast-import goes to recreate these objects using the canonical format
> > they will no longer have the same hash and your commit signatures will
> > get invalidated.  Other git commands will also refuse to create
> > objects with those oddities, even if git accepts ancient objects that
> > have them.
> >
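
To make the mode example concrete, here is a small sketch that computes
object IDs by hand, assuming a SHA-1 repository. The helper names are
invented for illustration; it only shows that the same directory entry
written as 040000 and as 40000 produces two different tree hashes, which
is why re-creating the tree in canonical form changes every hash that
depends on it.

```python
import hashlib

def git_object_id(obj_type, body):
    """SHA-1 object ID: hash of b'<type> <size>' + NUL, followed by the body."""
    header = obj_type.encode() + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()

def tree_entry(mode, name, oid_hex):
    """One raw tree entry: b'<mode> <name>' + NUL + the 20-byte binary ID."""
    return mode + b" " + name + b"\0" + bytes.fromhex(oid_hex)

# ID of the empty tree, used here as the subdirectory's content.
EMPTY_TREE = git_object_id("tree", b"")

# The same subdirectory entry, written with the canonical and the old-style mode.
canonical = git_object_id("tree", tree_entry(b"40000", b"sub", EMPTY_TREE))
legacy = git_object_id("tree", tree_entry(b"040000", b"sub", EMPTY_TREE))

print(canonical)
print(legacy)
print(canonical != legacy)  # the two trees get different IDs
```
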
> > So, it's basically impossible to have a "complete representation of
> > all events in a repository" that does what you want except for the
> > *original* binary format.  (But if you really want to see the original
> > binary format, maybe `git cat-file --batch` will be handy to you.)
> >
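
If reading the output of `git cat-file --batch` from Python is of
interest, a minimal parser might look like the sketch below. It assumes
the default batch output format (a `<oid> <type> <size>` header line, the
raw bytes, then a newline) and skips lines such as `<name> missing`. The
function name and the sample data are made up.

```python
import io

def read_batch(stream):
    """Yield (oid, type, raw_bytes) from `git cat-file --batch` output."""
    while True:
        header = stream.readline()
        if not header:
            return  # end of stream
        fields = header.split()
        if len(fields) != 3:  # e.g. b"<name> missing"
            continue
        oid, obj_type, size = fields
        body = stream.read(int(size))
        stream.read(1)  # consume the newline that follows each object body
        yield oid.decode(), obj_type.decode(), body

# Synthetic input standing in for real git output.
sample = io.BytesIO(b"abc123 blob 6\nhello\n\n")
for oid, kind, raw in read_batch(sample):
    print(oid, kind, raw)
```

With a real repository, the stream could come from the stdout of
`git cat-file --batch --batch-all-objects` started via subprocess.
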
> > But I think fast-export's --reference-excluded-parents might come in
> > handy for you and let you do what you want.
>
> ...to add to that line of thinking, it's also a completely valid
> technique to just completely rewrite your repository, then (re-)push the
> old signed tags to refs/tags/*.

The repository in question didn't have any signed tags, just a signed commit.

> By default they won't be pulled down as they won't reference commits on
> branches you're fetching, and you can also stick them somewhere else
> than refs/tags/*, e.g. refs/legacy-tags/*.
>
> None of the commit history will be the same, but the content (mostly)
> will, which is usually what matters when checking out an old tag.
>
> Of course this hack has little benefit over just keeping a foo-old.git
> repo around, and moving on with new history in your new foo.git.



