[RFC] Maliciously tampering with Git metadata?

Santiago Torres <santiago@xxxxxxx> · Tue, 15 Dec 2015 22:26:39 -0500

Hello everyone,

I'm Santiago, a PhD student at NYU doing research on secure software
development pipelines. We've been studying different aspects of Git
lately (as it is an integral part of many projects), and we believe
we've found a vulnerability in the way Git structures and signs metadata.

An attacker able to act as a man in the middle between a GitHub server
and a developer can trick that developer into merging vulnerable commit
objects or omitting security patches, even if all users sign all commit
objects. Because Git metadata such as branch pointers is unsigned, it
can be modified to give downstream developers an incorrect view of a
repository.
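
To illustrate why signing commit objects does not cover branch pointers,
here is a minimal Python sketch of how a Git object ID is derived. The
commit body, author name, and timestamps below are made up for
illustration; the point is only that no branch name appears in the bytes
that are hashed, so moving a ref leaves the commit and its ID untouched.

```python
import hashlib


def git_object_id(obj_type: str, body: bytes) -> str:
    """SHA-1 over the header "<type> <size>\\0" followed by the object body."""
    header = f"{obj_type} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


# Hypothetical commit body (author and dates are invented for this sketch).
commit_body = (
    b"tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904\n"
    b"author A Developer <dev@example.com> 1450236399 -0500\n"
    b"committer A Developer <dev@example.com> 1450236399 -0500\n"
    b"\n"
    b"Example commit\n"
)
commit_id = git_object_id("commit", commit_body)

# The same commit ID can be listed under any ref name; nothing in the
# hashed bytes ties the commit to a particular branch.
honest_refs = {"refs/heads/experimental": commit_id}
tampered_refs = {"refs/heads/master": commit_id}

print(commit_id)
print(honest_refs["refs/heads/experimental"] == tampered_refs["refs/heads/master"])
```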

An example of a malicious commit merge follows:

1) The attacker, controlling or acting as the upstream server,
identifies two branches: one on which the unsuspecting developer is
working, and another in which a vulnerable piece of code is located.

2) Branch pointers are modified: the packed-refs file (or refs/heads/*)
is edited so that the master branch points to the vulnerable commit
object. After this change, the attacker needs no further configuration
and simply waits for an unsuspecting developer to pull.

3) Once a developer pulls, he or she will be prompted to merge his or
her code with the new change-set (the vulnerable commit). The merge will
look like nothing more than developer negligence. If no conflicts arise,
the attack will go unnoticed.

4) The developer pushes to upstream. All the traffic can be re-routed
back to the original repository. The target branch now contains a
vulnerable piece of code.
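
Step 2 can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: the object IDs are placeholders,
the branch name "experimental" is invented, and the snapshot comparison
is a generic way to spot a moved ref, not the defense mechanism
mentioned in this message.

```python
def parse_packed_refs(text):
    """Map ref names to object IDs from packed-refs style content."""
    refs = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, the '#' header, and '^' peeled-tag lines.
        if not line or line[0] in "#^":
            continue
        object_id, ref_name = line.split(" ", 1)
        refs[ref_name] = object_id
    return refs


def changed_refs(before, after):
    """Return {ref: (old_id, new_id)} for refs that differ between snapshots."""
    names = sorted(set(before) | set(after))
    return {
        name: (before.get(name), after.get(name))
        for name in names
        if before.get(name) != after.get(name)
    }


# Placeholder object IDs, not real commits.
GOOD = "a" * 40
VULNERABLE = "b" * 40

original = (
    "# pack-refs with: peeled fully-peeled sorted\n"
    f"{VULNERABLE} refs/heads/experimental\n"
    f"{GOOD} refs/heads/master\n"
)
# The attacker rewrites one line so that master points at the vulnerable commit.
tampered = original.replace(
    f"{GOOD} refs/heads/master", f"{VULNERABLE} refs/heads/master"
)

diff = changed_refs(parse_packed_refs(original), parse_packed_refs(tampered))
print(diff)
```

A client that kept its own record of earlier ref values could notice
such a change, but nothing in the objects themselves would reveal it.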

We have identified additional attack scenarios in which modified
metadata results in an incorrect state of the target repository, and
we are ready to disclose information about these other variants of the
attack as well.

We also designed a backwards-compatible defense mechanism to prevent
attacks based on Git metadata tampering. We implemented a proof of
concept of the scheme and performed timing, stress, and concurrency
tests; our results show that the overhead should be minimal, even in
large software repositories such as the Linux kernel.

We have already approached people from CERT and GitHub about this attack
scenario, and we'd also like to hear your comments on it.

Thanks!
-Santiago.

P.S. We elaborate on this attack vector in this document:
https://drive.google.com/a/nyu.edu/file/d/0B2KBm0fULlS1RDR5UHVESjVua3M/view?usp=sharing