Johannes Schindelin wrote:
> Hi,
>
> On Fri, 9 Feb 2007, Rogan Dawes wrote:
>
> > Johannes Schindelin wrote:
> > > On Fri, 9 Feb 2007, Christoph Duelli wrote:
> > > > Is it possible to restrict a checkout, clone or a later pull to some
> > > > subdirectory of a repository?
> > >
> > > No. In git, a revision really is a revision, and not a group of file
> > > revisions.
>
> I thought about how this might be implemented, although I'm not entirely
> sure how efficient this will be.
>
> There are basically three ways I can think of:
>
> - rewrite the commit objects on the fly. You might want to avoid the use
>   of the pack protocol here (i.e. use HTTP or FTP transport).
>
> - try to teach git a way to ignore certain missing objects and
>   directories. This might be involved, but you could extend upload-pack
>   easily with a new extension for that.
>
> (my favourite:)
>
> - use git-split to create a new branch, which only contains doc/. Do work
>   only on that branch, and merge into mainline from time to time.
>
>   If you don't need the history, you don't need to git-split the branch.
>   You only need to make sure that the newly created branch is _not_
>   branched off of mainline, since the next merge would _delete_ all files
>   outside of doc/ (merge would see that the files exist in mainline, and
>   existed in the common ancestor, too, so would think that the files were
>   deleted in the doc branch).
>
> Ciao,
> Dscho
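(To make the third option concrete: as I understand it, the workflow would
be roughly the following. git-split is not a stock git command, so the
syntax below is invented purely for illustration.)

$ git split master doc/ docbranch  # hypothetical: new branch holding only doc/
$ git checkout docbranch
  ... edit files under doc/ ...
$ git commit -a -m "fix documentation"
$ git checkout master
$ git merge docbranch              # fold the doc/ work back into mainline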
Your third option sounds quite clever, apart from the problem of
attributing a commit and a commit message to someone when the resulting
commit doesn't match what they actually did :-(

I also wonder what happens when they check out a few more files. Do we
rewrite those commits as well? What happens if the user has already made
some commits? What if they have already sent those upstream? And so on.
I think the best solution is ultimately to make git able to cope with
certain missing objects.
I started writing this in response to another message, but it will do
fine here, too:
The description I give here will likely horrify people with its
communication inefficiency, but I'm sure that can be improved.
Scenario:
A user sees a documentation bug in a git-managed project, and decides
that she wants to do something about it. Since she is not on the fastest
of connections, she'd like to reduce the checkout to a reasonable
minimum, while still working with the git tools.
Viewing the repo layout using gitweb, she sees that all the
documentation is stored in the docs/ directory at the root of the project.
So, she creates a local repo to work in:
$ git init-db
She configures her local repo to reference the source one:
(Hypothetical syntax)
$ git clone --reference http://example.com/project.git \
      http://example.com/project.git
Since the reference and repo are the same (and non-local), git doesn't
actually download anything, other than the current heads (and maybe tags).
She then does a partial checkout of the master branch, but only the
docs/ directory:
$ git checkout -p master docs/
The -p flag indicates that this is a partial checkout of master. Git
records that the current HEAD is "master", checks out the docs/
directory, and removes any other files in the working directory (that it
knew about from the existing index, if any - I'm not suggesting that it
should arbitrarily delete files!)
The checkout process goes as follows: resolve the <treeish> that HEAD
points to, and retrieve it from the upstream repo if it does not exist
locally. Continue requesting only the tree and blob objects necessary to
satisfy the requested checkout. That is, from the root tree, identify the
docs/ entry and request only that tree object. Continue to download tree
and blob objects until the entire docs/ directory can be created in the
working directory.
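In terms of today's plumbing, the walk might look something like this;
the new part is that each step would have to fetch the named object from
the upstream repo whenever it is missing locally (with the dumb HTTP
transport, loose objects can already be fetched one at a time):

$ git rev-parse master             # resolve the branch to a commit id
$ git cat-file commit <commit-id>  # read the commit to find its root tree
$ git ls-tree <root-tree-id> docs  # read the root tree, note the docs/ subtree id
$ git ls-tree -r <docs-tree-id>    # recurse, listing every tree/blob under docs/

Only the objects named along this path ever need to be transferred.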
This will likely require a new index file format that stores the usual
stat information for the files that were checked out, plus the hashes of
the objects (blobs or trees) that were not.

In other words, alongside the regular entries, keep a "negative index"
(pindex?) with details of the files and directories that were not checked
out. Obviously, this does not need to recurse into directories that were
not checked out: simply having the hash of the unfetched parent tree in
the pindex is sufficient information to reconstruct a full index later.
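To illustrate (the layout is made up, purely to pin the idea down), the
pindex could simply reuse the ls-tree output format, one line per
unfetched entry at the boundary:

100644 blob 6ff87c4664981e4397625791c8ea3bbb5f2279a3	COPYING
040000 tree 99f1a6d12cb4b6f19c8655fca46c3ecf317074e0	src

Reusing an existing format would mean no new parser is needed, and
expanding these entries into a full index is then just a matter of
fetching the listed trees.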
Then creating a new commit would require creating the necessary blobs
for changed files, new tree objects for trees that change, and a commit
object.
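With existing plumbing, that might look roughly like this (the entries
files are placeholders for ls-tree-style listings assembled from the
checked-out files plus the pindex):

$ git hash-object -w docs/tutorial.txt   # store the edited file as a new blob
$ git mktree < new-docs-entries          # build the new docs/ tree
$ git mktree < new-root-entries          # build the new root tree; hashes for the
                                         # never-fetched entries come from the pindex
$ echo "fix documentation bug" | git commit-tree <new-root-id> -p <parent-id>
$ git update-ref refs/heads/master <new-commit-id>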
As far as I can tell, that could then be pushed/pulled/merged using the
existing tools, without any problems.
Rogan