Re: Security problem

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




On Fri, 16 Jun 2006, Alexander Litvinov wrote:
>
> You are right, I trust my file system. But if our team had central repo with 
> ssh access to that machine, every developer can hack central repo.
> 
> Whould git-clone/git-fetch warn me about this ?

Using the native protocol, yes. Using rsync, unless you explicitly fsck 
the result, no.

> It can't checkout object (3609f20ebd357679b111783e8afaf36ec46427f3 is the 
> original file). It seems packed repos are safe from this point.

Well, they may not be "safe" - you just need to work a _lot_ harder to 
corrupt a pack-file in any interesting manner. And again, git-fsck-objects 
would pick up any such thing going on.

Anyway, what it boils down to is that anybody who has write access to a 
particular repository can certainly change the repo in "interesting" ways. 

However, there are various inherent safety valves in place that make it 
really hard to corrupt on a bigger scale.

The first is that git-fsck-objects will definitely find any repository 
inconsistency, and to get around that, you either have to get around the 
basic properties of SHA-1 (ie break the hash) _or_ you have to actually 
change the repository so that it's still a valid repo, just with different 
content.

So let's take a look at those two cases:

 - if you corrupt the repository, subsequent clones (or even pulls) from 
   the corrupt repository simply won't work if you use the native 
   protocol, because the native protocol doesn't actually trust anything 
   but the actual contents (so if the contents won't match, then neither 
   will the SHA1 names). So the corruption is pretty strictly limited to 
   the _one_ repository that the attacker had write access to.

   So there's a pretty fundamental "corruption containment" part there.

   (Side note: there's no question that we might well be able to do 
   better. A _malicious_ server could actually send a corrupt pack, and 
   it's possible that a properly corrupted remote archive could cause even 
   a "good" git-send-pack to just silently send a corrupt pack, so that 
   you'd need to use "git-fsck-objects" on the receiving side to notice 
   that you are missing objects, for example)

 - if the repository is good (ie fsck is fine), then obviously a "git 
   pull" will also succeed. However, you can't _hide_ the data the way you 
   tried to do: when the receiver checks out the most recent version, it 
   will definitely use the data in the object, there's no way to get the 
   server to serve different data in objects and in the working tree 
   (because the server literally doesn't even send the working tree at 
   all).

   So you can always convince somebody to pull from an "evil repository", 
   and that's no different from committing a bug by mistake. But at least 
   you can't try to hide the bug just in the object store and have it not 
   show up in diffs and in checked-out copies.

The latter case is true even with http and rsync, the actual pull event 
always pulls just the database, never any checked-out state (in fact, 
the common case is obviously to pull from a bare repository that doesn't 
even _have_ checked-out state). So you can't hide things in the index or 
in the checked-out state except in the filesystem that you have direct 
write access to.

But yeah, I actually still personally do a fair number of 
"git-fsck-objects". I've never found anything that way since very early on 
(and back then, the real problem was rsync getting objects that weren't 
reachable), but I still do it. It makes me feel happier.

Of course, bugs always happen. But I can pretty much guarantee that git is 
fundamentally harder to corrupt than most things. We've had git-fsck-cache 
since April 8th last year (or, put another way, literally since "Day 2" in 
git terms - it's the eight commit in the whole git history).

Git also has an almost total lack of redundant information. There's 
basically no "duplicate" information in the repository format itself where 
you could hide something so that it wouldn't be noticed.

In a checked-out project, the checked-out state itself is "duplicate 
information" (and that was where your "attack" tried to hide things), and 
there's the index (which is actually a much better and subtle place to 
hide things ;). But neither of them have any life outside of that 
particular repository.

			Linus
-
: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]