RE: Dealing with corporate email recycling

On March 13, 2022 10:41 AM, Sean Allred wrote:
><rsbecker@xxxxxxxxxxxxx> writes:
>> (I am a little nervous about this advice, hoping others will chime in
>> and correct anything wrong here)
>>
>> While this will change the commit hashes, AFAIK, the other metadata is
>> preserved, including date, author, and committer. Set up the specific
>> keys/settings in ssh-agent and the user.signingKey value, then:
>>
>> git filter-branch --commit-filter 'git commit-tree -S "$@";' \
>>     <FROM-COMMIT>..<TO-COMMIT>
>>
>> Others might have a better way of doing this or may tell me this will
>> not work. Test this before you do it. I have not done this operation
>> before. You do need to start from the oldest commit going forward
>> otherwise I think that filter-branch will (should!) invalidate child
>> commits. I suspect this is going to be a rather lengthy script to build and run.
>
>Given the size of our history (several orders of magnitude larger than linux.git),
>using git-filter-branch after the fact is certainly not ideal.  The replay already takes
>a week to run (we're IO-bound).  We'd rather want to extend git-fast-import to
>allow signing commits instead
>-- which comes back to our shared 'nervousness' about this approach in
>general: I don't know that Git should endorse this as a standard option.
>
>But yes -- hoping others can chime in with more thoughts :-)

I have another reluctant suggestion, but it depends on your industry, regulations, and other factors. Some sectors require that only a limited period of history be kept. In some settings, keeping user-identifying information beyond, say, 7 years is actually problematic, so pruning your history may be not only an option but a requirement. An alternative is to use filter-branch to essentially tokenize the identities of past authors and keep the token-to-identity mapping in an electronic vault somewhere. I have customers who interpret GDPR-like rules as covering exactly this situation: employees who left more than 7 years ago cannot be retained, by name, in the repos. I am not personally happy about that, because my own repo-OCD demands that I know exactly who did what until the end of time, but according to them, keeping the names actually violates the local regulations. I'm sure you have had conversations with lawyers, yes? ☹
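
To make the tokenizing idea concrete, here is a rough, untested-at-scale sketch of what it might look like with filter-branch's --env-filter. Everything named here is my own assumption for illustration: the TOKEN_SALT and VAULT_FILE variables, the two function names, the "user-<token>" naming scheme, and the use of a salted SHA-256 (via sha256sum) as the token. Like any filter-branch run, it changes the hash of every rewritten commit, so try it on a throwaway clone first.

```shell
#!/bin/sh
# Sketch only: TOKEN_SALT, VAULT_FILE, and both function names are
# hypothetical and do not come from the thread.

# Derive a stable, opaque 16-hex-character token from a salt ($1) and an
# email address ($2). A salted SHA-256 is used for brevity; a keyed HMAC
# would be a stronger choice.
tokenize_identity() {
    printf '%s:%s' "$1" "$2" | sha256sum | cut -c1-16
}

# Rewrite author and committer identities for the revisions given as
# arguments, appending "token<TAB>Name <email>" lines to $VAULT_FILE.
# VAULT_FILE must be an absolute path outside the repository, because
# filter-branch runs its filters from a temporary directory.
pseudonymize_history() {
    # Exported so the filter snippet below can read them.
    export TOKEN_SALT VAULT_FILE
    # The filter cannot see shell functions defined in this script, so the
    # token logic is repeated inline as tok().
    git filter-branch --env-filter '
        tok() { printf "%s:%s" "$TOKEN_SALT" "$1" | sha256sum | cut -c1-16; }
        a=$(tok "$GIT_AUTHOR_EMAIL")
        c=$(tok "$GIT_COMMITTER_EMAIL")
        printf "%s\t%s <%s>\n" "$a" "$GIT_AUTHOR_NAME" "$GIT_AUTHOR_EMAIL" >>"$VAULT_FILE"
        printf "%s\t%s <%s>\n" "$c" "$GIT_COMMITTER_NAME" "$GIT_COMMITTER_EMAIL" >>"$VAULT_FILE"
        GIT_AUTHOR_NAME="user-$a"
        GIT_AUTHOR_EMAIL="$a@invalid"
        GIT_COMMITTER_NAME="user-$c"
        GIT_COMMITTER_EMAIL="$c@invalid"
        export GIT_AUTHOR_NAME GIT_AUTHOR_EMAIL
        export GIT_COMMITTER_NAME GIT_COMMITTER_EMAIL
    ' -- "$@"
}
```

With TOKEN_SALT and VAULT_FILE set, something like `pseudonymize_history --all` would rewrite every ref. The vault file collects one line per commit, so running `sort -u` over it afterwards gives one line per identity. Whether a salted hash counts as adequate pseudonymization under your rules is a question for those lawyers, not for me.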



