Re: Fetching SHA id's instead of named references?

Johannes Schindelin <Johannes.Schindelin@xxxxxx> wrote:
> On Mon, 6 Apr 2009, Klas Lindberg wrote:
> 
> > Is there a way to fetch based on SHA id's instead of named references?
> 
> No, out of security concerns;  imagine you included some proprietary 
> source code by mistake, and undo the damage by forcing a push with a 
> branch that does not have the incriminating code.  Usually you do not 
> control the garbage-collection on the server, yet you still do not want 
> other people to fetch "by SHA-1".
> 
> BTW this is really a strong reason not to use HTTP push in such 
> environments.

Err, you mean http:// and rsync:// fetch, don't you?  Because if
you rely on being able to unpublish a ref, you have to use only the
native git:// protocol, where direct object access is otherwise
forbidden.

Anyway.

The fetch-pack/upload-pack protocol uses SHA1s in the want commands,
so in theory at the protocol level you can say "git fetch URL SHA1"
and convey your request to the remote peer.
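
At that layer the client's side of a fresh fetch looks roughly like
this (a sketch: made-up SHA1s, pkt-line length prefixes and most
capabilities omitted):

  want 3b18e512dba79e4c8300dd08aeb37f8e728b8dad side-band-64k
  want 6ff87c4664981e4397625791c8ea3bbb5f2279a3
  0000
  done

Upload-pack then streams back a packfile with those objects and
everything reachable from them, minus whatever the client said it
already has.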

The problem is that upload-pack won't perform a reachability
analysis to determine whether a wanted SHA1 is reachable from a
current ref.  Instead it requires that the wanted SHA1 be *exactly*
referenced by at least one ref.

I had previously proposed adding a merge base test if SHA1 parses
as a commit, but IIRC Junio rejected the idea, saying it was too
costly to perform on the server.

The thing is, he's right.

There's no reason to perform the reachability test on the server
when you can move it onto the client, and that's exactly what
git-submodule does.  It fetches everything, and then assumes the
commit is reachable post fetch.  Since the client has fetched
everything, the client will have the object if it's reachable from
the server's refs.
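
A rough sketch of that client-side test, where URL is your remote
and $C holds the commit SHA1 you are after (both placeholders):

  git fetch URL &&
  if git rev-parse --quiet --verify "$C^{commit}" >/dev/null
  then
    :  # present locally, so it was reachable from the server's refs
  else
    echo >&2 "$C is not reachable from any ref the server publishes"
  fi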

If the object is no longer reachable from the server's refs (think
of a rebased branch), then the object is in real danger of being
GC'd out of the server's object store.  So you are already playing
with fire, even if we added a server-side config option to permit
fetching of unreachable data.  A future "git gc" on that server
repository could suddenly wipe out that data entirely.


Klas, one suggestion might be to make a "refs/heads/world" ref
which carries a threaded chain of merges of every commit you ever
recorded in the supermodule; then, post fetch, you can assume the
whole world is reachable.

E.g. every time you want to record a commit in the manifest file,
also shove it into the world:

  C=...commit.to.save... &&
  W=$(git rev-parse refs/heads/world) &&
  git update-ref refs/heads/world \
    $(echo "Save $C, save the world" | git commit-tree $W^{tree} -p $W -p $C) \
    $W &&
  git push URL refs/heads/world
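
On the fetching side you would then grab the world ref and look the
commit up locally, something like (URL and $C again placeholders):

  git fetch URL refs/heads/world &&
  git rev-parse --quiet --verify "$C^{commit}"

If $C was ever folded into the world chain, the second command will
find it in your local object store.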

One way we get away with this sort of thing in repo is that we only
put SHA1s in our manifest that are published on branches which
won't ever rewind or be deleted.  Hence, it's a moot point.

-- 
Shawn.
