Re: [PATCH 6/6] upload-pack: provide a hook for running pack-objects

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, May 19, 2016 at 12:12:43PM +0200, Ævar Arnfjörð Bjarmason wrote:

> On Thu, May 19, 2016 at 12:45 AM, Jeff King <peff@xxxxxxxx> wrote:
> >   3. You may want to insert a caching layer around
> >      pack-objects; it is the most CPU- and memory-intensive
> >      part of serving a fetch, and its output is a pure
> >      function[1] of its input, making it an ideal place to
> >      consolidate identical requests.
> 
> Cool to see this on the list after we talked briefly about this at Git
> Merge. Being able to cache this so simply is a great optimization.
> 
> As I recall you guys at GitHub ended up writing your own utility to
> cache output depending on stdin/argv because none existed already.

Yeah, we do have such a tool internally. It's possible we may one day
open-source that, but there aren't plans to do so right now.

I don't know whether this kind of caching would be useful to most sites
or not. It's good if you have lots of clients asking you for the same
thing at roughly the same time (say, somebody using "git pull" as a
deploy mechanism from their AWS cluster), but otherwise not.

> So do I understand correctly that you're trying to guard against the
> case where you e.g.:
> 
>     rsync untrusted.example.com:/tmp/poison.git /tmp/
>     git clone /tmp/poison.git /tmp/safe.git
> 
> Not hosing your system if the poison.git/config has a
> uploadpack.packObjectsHook that's "sudo rm -rf /".

I'm not that worried about this case, as it's just not that common.  I
think we're more concerned with two cases:

  1. multi-user servers where you ssh as yourself, but then access
     repositories owned by somebody else. This is basically the ssh case
     you described later.

  2. hosting sites that run git-daemon as the "daemon" user, but serve
     repositories owned by random untrusted users (where you would not
     want those users to run arbitrary code as "daemon").

> We've already accepted that "push" hooks like the pre-receive or
> update hook can do something malicious like this, so on one hand maybe
> we should say if you scp raw *.git repositories with hooks this sort
> of thing might happen, or if you ssh to a remote box and run their
> per-repo hooks it's really their problem to make sure their users
> don't run malicious hooks on your behalf.

Yeah, we make no promises for repositories that you push to. It's _only_
for the fetching side. It's kind of a funny distinction, but it's one we
have maintained since the beginning of git, and I do think there are
real sites that depend on it (see, e.g., the history of the
post-upload-pack hook added in the v1.6.x time frame).

Rsyncing a repository is generally of questionable safety. It's OK to
fetch from the result, but certainly not to run "git log" (which can run
arbitrary commands via external diff, etc).

> But as you point out this makes the hook interface a bit unusual.
> Wouldn't this give us the same security and normalize the hook
> interface:
> 
>  * Don't do the uploadpack.packObjectsHook variable, just have a
> normal "pack-objects" hook that works like any other git hook
>  * By default we don't run this hook unless core.runDangerousHooks (or
> whatever we call it) is true.
>  * The core.runDangerousHooks variable cannot be set on a per-repo
> basis using your new config facility.
>  * If there's a pack-objects hook and core.runDangerousHooks isn't
> true we warn "not executing potentially unsafe hook $path_to_hook" and
> carry on

This is the "could we just set a bool" option I discussed in the commit
message. The problem is that it doesn't let the admin say "I don't trust
these repositories, but I _do_ want to run just this one hook when
serving them, and not any other hooks".

> This would allow use-cases that are a bit inconvenient with your patch
> (again, if I'm understanding it correctly):
> 
>  * I can set core.runDangerousHooks=true in /etc/gitconfig on my git
> server because I also control all the repos, and I want to experiment
> with trying this on a per-repo basis for users that are cloning from
> me.
>  * I can similarly play with this locally knowing I'm only cloning
> repos I trust by setting core.runDangerousHooks=true in ~/.gitconfig

Yes, those use cases are not well served by the git config alone. But
you can do them (and much more) once your trusted hook is running (by
checking $GIT_DIR, or looking in a database, or whatever you want).

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]