On March 13, 2022 10:41 AM, Sean Allred wrote:
><rsbecker@xxxxxxxxxxxxx> writes:
>> (I am a little nervous about this advice, hoping others will chime in
>> and correct anything wrong here)
>>
>> While this will change the commit hashes, AFAIK, the other metadata is
>> preserved, including date, author, and committer. Set up the specific
>> keys/settings in ssh-agent and the user.signingKey value, then:
>>
>> git filter-branch --commit-filter 'git commit-tree -S "$@";'
>> <FROM-COMMIT>..<TO-COMMIT>
>>
>> Others might have a better way of doing this or may tell me this will
>> not work. Test this before you do it. I have not done this operation
>> before. You do need to start from the oldest commit going forward
>> otherwise I think that filter-branch will (should!) invalidate child
>> commits. I suspect this is going to be a rather lengthy script to build and run.
>
>Given the size of our history (several orders of magnitude larger than linux.git),
>using git-filter-branch after the fact is certainly not ideal. The replay already takes
>a week to run (we're IO-bound). We'd rather want to extend git-fast-import to
>allow signing commits instead
>-- which comes back to our shared 'nervousness' about this approach in
>general: I don't know that Git should endorse this as a standard option.
>
>But yes -- hoping others can chime in with more thoughts :-)

I have another reluctant suggestion, but it depends on your industry, regulations, and other factors. In some sectors, there is a requirement to keep only a limited period of history. In fact, in some settings, keeping user-identifying information beyond, say, 7 years is actually problematic. Pruning your history may be not only an option but required. An alternative is to use filter-branch to essentially tokenize the identities of past authors and keep the mapping from token to real identity in an electronic vault somewhere.
I have customers who are interpreting GDPR-like rules in just such a situation, where employees who left 7 years ago cannot be retained, by name, in the repos. I am not personally happy about that, because my own repo-OCD demands that I know exactly who did what until the end of time, but according to them, retaining those names actually violates the local regulations. I'm sure you have had conversations with lawyers, yes? ☹
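
To make the tokenizing idea concrete, here is a minimal sketch using filter-branch's --env-filter. It is untested on any real history and rests on assumptions: the vault is a plain tab-separated file kept outside the repository, and VAULT, token_for, rewrite_identities, and the user-N / @invalid token format are hypothetical names made up for this example. As with the signing approach quoted above, every rewritten commit gets a new hash, so try it on a throwaway clone first.

```shell
#!/bin/sh
# Sketch only: replace author/committer identities with opaque tokens and
# keep the email-to-token mapping in a "vault" file outside the repository.
# All names here (VAULT, token_for, rewrite_identities) are hypothetical.
VAULT="${VAULT:-$HOME/identity-vault.tsv}"   # should be an absolute path

# The helper is kept as text so the same definition can be handed to
# filter-branch, which evaluates --env-filter inside its own shell.
TOKEN_FN=$(cat <<'EOF'
token_for() (
    # The body runs in a subshell, so none of its variables leak out.
    [ -f "$VAULT" ] || : > "$VAULT"
    tok=$(LOOKUP="$1" awk -F '\t' '$1 == ENVIRON["LOOKUP"] { print $2; exit }' "$VAULT")
    if [ -z "$tok" ]; then
        # Unknown identity: mint the next sequential token and record it.
        n=$(wc -l < "$VAULT")
        tok="user-$((n + 1))"
        printf '%s\t%s\n' "$1" "$tok" >> "$VAULT"
    fi
    printf '%s\n' "$tok"
)
EOF
)
eval "$TOKEN_FN"   # also define token_for in the current shell

# rewrite_identities REV-LIST-ARGS...
# Rewrites the selected commits, e.g.: rewrite_identities --all
rewrite_identities() {
    export VAULT
    git filter-branch --env-filter "$TOKEN_FN"'
        # Look up both tokens before overwriting any of the variables.
        pseudo_author=$(token_for "$GIT_AUTHOR_EMAIL")
        pseudo_committer=$(token_for "$GIT_COMMITTER_EMAIL")
        GIT_AUTHOR_NAME=$pseudo_author
        GIT_AUTHOR_EMAIL=$pseudo_author@invalid
        GIT_COMMITTER_NAME=$pseudo_committer
        GIT_COMMITTER_EMAIL=$pseudo_committer@invalid
        export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL
        export GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
    ' -- "$@"
}
```

This only touches the author and committer fields. Names inside commit messages (Signed-off-by trailers, for example) or file contents are left alone, and the original commits stay reachable through refs/original and the reflogs until those are cleaned up. It also does nothing about the run time concern raised above for a history of this size.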