On Mon, Mar 1, 2021 at 12:04 PM Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx> wrote:
>
> On Mon, Mar 01 2021, Elijah Newren wrote:
>
> > On Sun, Feb 28, 2021 at 11:44 PM anatoly techtonik <techtonik@xxxxxxxxx> wrote:
> >>
> >> On Sun, Feb 28, 2021 at 1:34 PM Ævar Arnfjörð Bjarmason
> >> <avarab@xxxxxxxxx> wrote:
> >> >
> >> > I think Elijah means that in the general case people are using fast
> >> > export/import to export/import between different systems or in
> >> > combination with a utility like git-filter-repo.
> >> >
> >> > In those cases users are also changing the content of the repository, so
> >> > the hashes will change, invalidating signatures.
> >> >
> >> > But there's also cases where e.g. you don't modify the history, or only
> >> > part of it, and could then preserve these headers. I think there's no
> >> > inherent reason not to do so, just that nobody's cared enough to submit
> >> > patches etc.
> >>
> >> Is fast-export/import the only way to filter information in `git`? Maybe there
> >> is a slow json-export/import tool that gives a complete representation of all
> >> events in a repository? Or API that can be used to serialize and import that
> >> stream?
> >>
> >> If no, then I'd like to take a look at where header filtering and serialization
> >> takes place. My C skills are at the "hello world" level, so I am not sure I can
> >> write a patch. But I can write the logic in Python and ask somebody to port
> >> that.
> >
> > If you are intent on keeping signatures because you know they are
> > still valid, then you already know you aren't modifying any
> > blobs/trees/commits leading up to those signatures. If that is the
> > case, perhaps you should just avoid exporting the signature or
> > anything it depends on, and just export the stuff after that point.
> > You can do this with fast-export's --reference-excluded-parents option
> > and pass it an exclusion range.
> > For example:
> >
> >   git fast-export --reference-excluded-parents ^master~5 --all
> >
> > and then pipe that through fast-import.
> >
> > In general, I think if fast-export or fast-import are lacking features
> > you want, we should add them there, but I don't see how adding
> > signature reading to fast-import and signature exporting to
> > fast-export makes sense in general. Even if you assume fast-import
> > can process all the bits it is sent (e.g. you extend it to support
> > commits without an author, tags without a tagger, signed objects, any
> > other extended commit headers), and even if you add flags to
> > fast-export to die if there are any bits it doesn't recognize and to
> > export all pieces of blobs/trees/tags (e.g. don't add missing authors,
> > don't re-encode messages in UTF-8, don't use grafts or replace
> > objects, keep extended headers such as signatures, etc.), then it
> > still couldn't possibly work in all cases in general. For example, if
> > you had a repository with unusual objects made by ancient or broken
> > git versions (such as tree entries in the wrong sort order, or tree
> > entries that recorded modes of 040000 instead of 40000 for trees or
> > something with perms other than 100644 or 100755 for files), then when
> > fast-import goes to recreate these objects using the canonical format
> > they will no longer have the same hash and your commit signatures will
> > get invalidated. Other git commands will also refuse to create
> > objects with those oddities, even if git accepts ancient objects that
> > have them.
> >
> > So, it's basically impossible to have a "complete representation of
> > all events in a repository" that does what you want except for the
> > *original* binary format. (But if you really want to see the original
> > binary format, maybe `git cat-file --batch` will be handy to you.)
> >
> > But I think fast-export's --reference-excluded-parents might come in
> > handy for you and let you do what you want.
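
The exclusion-range approach quoted above can be sketched end to end roughly as follows. This is only an illustration under stated assumptions: the repository names `src` and `dst`, the file `f`, and the one-commit range `^master~1` are made up for the demo, and it assumes a git new enough to have --reference-excluded-parents. The receiving repository must already contain the excluded parent commits, since the stream refers to them by hash.

```shell
#!/bin/sh
# Sketch only: "src", "dst" and "f" are hypothetical names for this demo.
set -eu

work=$(mktemp -d)
cd "$work"

# Fixed identity so the demo does not depend on the local git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# "src" stands in for the original repository.
git init -q src
git -C src symbolic-ref HEAD refs/heads/master
for i in 1 2; do
    echo "$i" > src/f
    git -C src add f
    git -C src commit -q -m "commit $i"
done

# "dst" already holds the older history, which is never re-exported
# (this is where a signed commit would live, untouched).
git clone -q src dst

# New work that exists only in "src".
echo 3 > src/f
git -C src add f
git -C src commit -q -m "commit 3"

# Export only what comes after master~1; the excluded parent is
# referenced by its hash instead of being recreated.
git -C src fast-export --reference-excluded-parents ^master~1 --all |
    git -C dst fast-import --quiet

src_tip=$(git -C src rev-parse master)
dst_tip=$(git -C dst rev-parse master)
src_parent=$(git -C src rev-parse master~1)
dst_parent=$(git -C dst rev-parse master~1)
echo "src tip:    $src_tip"
echo "dst tip:    $dst_tip"
echo "dst parent: $dst_parent"
```

In a real filtering run a filter would sit between the two commands; the point here is only that everything at or before the excluded commit keeps its original hash.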
>
> ...to add to that line of thinking, it's also a completely valid
> technique to just completely rewrite your repository, then (re-)push the
> old signed tags to refs/tags/*.

The repository in question didn't have any signed tags, just a signed commit.

> By default they won't be pulled down as they won't reference commits on
> branches you're fetching, and you can also stick them somewhere else
> than refs/tags/*, e.g. refs/legacy-tags/*.
>
> None of the commit history will be the same, but the content (mostly)
> will, which is usually what matters when checking out an old tag.
>
> Of course this hack has little benefit over just keeping a foo-old.git
> repo around, and moving on with new history in your new foo.git.
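
For anyone who does have old tags to carry over, the refs/legacy-tags/* idea quoted above might look roughly like this. It is a sketch under assumptions, not a recommendation: `foo-old` and `foo.git` follow the names used above, the tag `v1.0` is made up, and an annotated tag (`git tag -a`) stands in for a signed one (`git tag -s`) so that the demo does not need a GPG key.

```shell
#!/bin/sh
# Sketch only: the tag name "v1.0" and file "f" are hypothetical.
set -eu

work=$(mktemp -d)
cd "$work"

# Fixed identity so the demo does not depend on the local git config.
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Old repository with one tagged commit.
git init -q foo-old
echo hello > foo-old/f
git -C foo-old add f
git -C foo-old commit -q -m "old history"
# Annotated tag as a stand-in for a signed tag (git tag -s).
git -C foo-old tag -a -m "release 1.0" v1.0

# New repository that will carry the rewritten history.
git init -q --bare foo.git

# Push every old tag into a separate namespace, so it does not
# collide with tags created on the new history.
git -C foo-old push -q "$work/foo.git" 'refs/tags/*:refs/legacy-tags/*'

old_tag=$(git -C foo-old rev-parse refs/tags/v1.0)
legacy_tag=$(git -C foo.git rev-parse refs/legacy-tags/v1.0)
echo "old tag object:    $old_tag"
echo "legacy tag object: $legacy_tag"
```

The tag object is transferred unchanged, so its hash (and, for a signed tag, its signature) is the same in both repositories.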