Re: Unresolved issues #2 (shallow clone again)

On Thu, 04 May 2006 01:15:03 -0700, Junio C Hamano wrote:
> * #irc 2006-04-10
>   Shallow clones (Carl Worth).
> 
>   The experiment last round did not work out very well, but as
>   existing repositories get bigger, and more projects being
>   migrated from foreign SCM systems, this would become a
>   must-have from would-be-nice-to-have.
> 
>   I am beginning to think using "graft" to cauterize history
>   for this, while it technically would work, would not be so
>   helpful to users, so the design needs to be worked out again.

I've been meaning to follow up with some thoughts on this topic, so
thanks for the tickler.

For the one use case I had (tracking the latest tree), I had floated
the idea of using "faked", parent-less commit objects to point to the
tree of interest. Junio pointed out that there's no protocol to learn
the name of a remote commit's tree from the name of the commit. I
worked around that by simply making the parent-less commit object on
the server side (under a branch name of "master-shallow", say).

That seemed to work just fine, and if someone really wanted to do
this, they could use a hook to maintain the master-shallow branch,
and no change to git itself would be needed. But there's a very
minimal amount of interesting functionality in this, and it's not
clear that it's much better than git-tar-tree. So I'm considering that
idea dead.

Meanwhile, a more general ability to use shallow clones would still be
very useful. I think what I'd like to be able to do is to pass
rev-list limiting options (--max-count, --max-age via --since, etc.)
to a fetch. That would limit the expansion of the WANT commits, and
then the existing logic that computes the objects needed to satisfy
the list of desired commits should do the right thing.

Then, in order for this to actually be useful, when returning objects
from a limited fetch like this, the server should provide a list of
commits that should be noted as cauterized (whether through the
existing grafts mechanism or otherwise).
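
As a rough illustration of the server side, here is a minimal Python
sketch, not git's actual implementation. It assumes the history is a
plain dict mapping each commit name to its list of parents, models only
a --max-count style limit, and walks breadth-first rather than in date
order as rev-list would. The name limited_walk is hypothetical, and
"cauterized" is read here as the parents that were cut off and not sent.

```python
from collections import deque


def limited_walk(parents, wants, max_count):
    """Expand the WANT commits, stopping after max_count commits.

    parents: dict mapping commit name -> list of parent names (assumed model).
    Returns (included, cauterized), where cauterized is the sorted list of
    parents of included commits that were themselves left out.
    """
    included = []
    seen = set()
    queue = deque(wants)
    while queue and len(included) < max_count:
        commit = queue.popleft()
        if commit in seen:
            continue
        seen.add(commit)
        included.append(commit)
        queue.extend(parents.get(commit, ()))

    included_set = set(included)
    # Every parent that was not sent marks a point where history was cut.
    cauterized = sorted(
        {p for c in included for p in parents.get(c, ()) if p not in included_set}
    )
    return included, cauterized


# Linear history A <- B <- C <- D, with D as the head being fetched.
history = {"D": ["C"], "C": ["B"], "B": ["A"], "A": []}
sent, cut = limited_walk(history, ["D"], max_count=2)
```

With the limit at 2, only D and C are sent and B is reported as the cut
point; with a limit larger than the history, nothing is cut.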

Additionally, when doing a fetch into a tree that has any such
cauterized commits, the client must also provide its list of
cauterized commits. So the conversation changes from "I WANT
<fetch-heads> and I HAVE <heads>" to one of "I WANT <fetch-heads>, and
I HAVE <heads>, except that I'm MISSING <cauterized-commits>".
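
The changed conversation might be sketched like this in Python. The
WANT/HAVE/MISSING line format simply mirrors the wording above and is
not git's real wire protocol; build_fetch_request and client_has are
hypothetical names, and the history is again assumed to be a dict from
commit name to parent list.

```python
def build_fetch_request(fetch_heads, local_heads, cauterized):
    """Client side: list what is wanted, what is held, and where history is cut."""
    lines = [f"WANT {h}" for h in fetch_heads]
    lines += [f"HAVE {h}" for h in local_heads]
    lines += [f"MISSING {c}" for c in sorted(cauterized)]
    return lines


def client_has(parents, haves, missing):
    """Server side: commits the client can be assumed to hold.

    Walks back from the HAVE commits but never enters a MISSING commit,
    so nothing at or behind a cut point is assumed present.
    """
    missing = set(missing)
    present = set()
    stack = [h for h in haves if h not in missing]
    while stack:
        commit = stack.pop()
        if commit in present:
            continue
        present.add(commit)
        stack.extend(p for p in parents.get(commit, ()) if p not in missing)
    return present


history = {"E": ["D"], "D": ["C"], "C": ["B"], "B": ["A"], "A": []}
request = build_fetch_request(["E"], ["D"], {"B"})
held = client_has(history, ["D"], ["B"])
```

Without the MISSING line the server would conclude from "HAVE D" that
the client also holds B and A, and could send a pack that depends on
objects the client never received.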

Finally, whenever a fetch receives a commit object that is in its
list of cauterized commits, it should remove that commit from the
list. This allows a shallow clone to be naturally migrated to
something unshallow. And the user can do this as incrementally as
desired based on the need to see more history:

get a bit:
	git fetch somewhere --since=2.weeks.ago

then a bit more:
	git fetch somewhere --since=1.year.ago

then get it all:
	git fetch somewhere
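
The client-side bookkeeping for that incremental deepening could look
roughly like the following Python sketch. It assumes the cauterized
list is kept as a plain set of commit names and that each fetch reports
both the commits received and the server's new cut points;
update_cauterized is a hypothetical helper, not an existing git
function.

```python
def update_cauterized(cauterized, received, newly_cauterized=()):
    """Return the cauterized set after one fetch.

    Cut points reported by the server are added, and any commit that has
    now actually been received is dropped from the set.
    """
    return (set(cauterized) | set(newly_cauterized)) - set(received)


# History A <- B <- C <- D; the first shallow fetch got D and C, cut at B.
state = {"B"}

# Deepen a bit: B arrives, and the server now reports A as the cut point.
state = update_cauterized(state, received=["B"], newly_cauterized=["A"])
after_deepen = set(state)

# Fetch everything: A arrives and no new cut points are reported.
state = update_cauterized(state, received=["A"])
```

Once the set is empty, the repository is no longer shallow and later
fetches need no MISSING list at all.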

Maybe that's no different from Junio's original proposal. If not, what
do you see in the above that wouldn't work?

-Carl
