On Mon, May 25, 2009 at 1:35 PM, Asger Ottar Alstrup <asger@xxxxxxxx> wrote:
> On Mon, May 25, 2009 at 5:50 PM, Avery Pennarun <apenwarr@xxxxxxxxx> wrote:
>> On Mon, May 25, 2009 at 5:33 AM, Asger Ottar Alstrup <asger@xxxxxxxx> wrote:
>>> No, that is unfortunately not so easy. If we could, I suppose we could
>>> use submodules instead.
>>
>> Your only option may be to use git filter-branch then. It lets you do
>> pretty much anything you want, although merging it back together again
>> could be entertaining. (Making it correctly mergeable is by far the
>> trickiest part of git-subtree.)
>
> OK, so git subtree is not usable as it is for this. Instead, it seems
> a new system has to be developed which would be similar to git subtree
> in spirit, except that it worked at a file-level. Of course, the git
> merge subtree strategy can not be used, so merging has to be done
> differently.

That sounds about right.

> So a poor mans system could work like this:
>
> - A reduced repository is defined by a list of paths in a file, I
> guess with a format similar to .gitignore

Are you sure you want to define the list with exclusions instead of
inclusions? I don't really know your use case.

Anyway, if you're using git filter-branch, it'll be up to you to fix
the index to contain the list of files you want. (See man
git-filter-branch)

> - To extract: A copy of the original repository is made. This copy is
> reduced using git filter-branch. Is there some way of turning a
> .gitignore syntax file into a concrete list of files? Also, can this
> entire step be done in one step without the copy? Having to copy the
> entire project first seems excessive. Will filter-branch preserve
> and/or prune pack files intelligently?

You probably need to read about the differences between git trees,
blobs, and commits. You're not actually "copying" anything; you're
just creating some new directory structures that contain the
*existing* blobs. And of course the existing blobs are in your
existing packs.

This is a pretty good introduction:
http://eagain.net/articles/git-for-computer-scientists/

> - To merge from the reduced to the original: The very simple version
> is just to copy all the files from the reduced repository into a
> checkout of the original repository, and then merge. This would not
> support removal (or renaming) of files, but that might be ok in my
> setup. If this needs to be more intelligent, the list of files in the
> reduced repository could be compared with the list of paths that was
> used to reduce it originally. This can be used to detect removals and
> additions of files.

Yes. In the slightly fancier version of this, you could just do all
your merges from subset->main and never from main->subset, and then a
simple "git merge subset" would handle the above comparison,
additions, and removals for you.

> - To merge from the original to the reduced: First merge the other
> way, and then extract again.

Yes.

> I am new to git, so please excuse me if this design is mentally unsound.

Well, you're getting pretty far out there:

- git subtree is already an experimental tool that hasn't been
  accepted by most people;

- you're doing something similar to git subtree, but even more
  complicated;

- git is known to work badly with large files, and you have a bunch
  of large files;

- git is intended to manage entire repositories at a time, and you
  want a partial checkout;

- git is intended to download the entire history at once, and you (I
  think) only want part of it.

By the time you're this far out, maybe what you want isn't git at
all. svn would work fine with this arrangement, and people who want
partial checkouts would rarely benefit from git's distributedness
anyway, I expect.

Have fun,

Avery
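
As a rough illustration of the file-list bookkeeping discussed above
(turning a pattern file into a concrete list of files, then comparing
that list against the reduced repository to detect additions and
removals), here is a minimal Python sketch. It is not part of git or
git-subtree. The function names expand_patterns and diff_file_lists
are hypothetical, and the matching uses Python's fnmatch globs, which
only approximate .gitignore syntax (no "!" negation, no anchoring
rules). In practice the list of tracked files would presumably come
from something like "git ls-files"; here it is passed in as a plain
list.

```python
from fnmatch import fnmatchcase


def expand_patterns(patterns, tracked_files):
    """Return the sorted tracked files matched by at least one pattern.

    Assumption: fnmatch-style globs stand in for .gitignore syntax.
    Blank lines and lines starting with "#" are skipped.
    """
    cleaned = [p.strip() for p in patterns]
    cleaned = [p for p in cleaned if p and not p.startswith("#")]
    return sorted(
        f for f in tracked_files
        if any(fnmatchcase(f, p) for p in cleaned)
    )


def diff_file_lists(extracted, current):
    """Compare the file list used at extraction time with the files
    now present in the reduced repository."""
    extracted, current = set(extracted), set(current)
    return {
        "added": sorted(current - extracted),
        "removed": sorted(extracted - current),
    }


# Hypothetical example data.
tracked = ["README", "docs/guide.txt", "src/main.c", "assets/big.bin"]
kept = expand_patterns(["# subset", "docs/*", "src/*.c"], tracked)
changes = diff_file_lists(kept, ["docs/guide.txt", "docs/faq.txt"])
```

With this sample data, kept holds the two files matched by the
patterns, and changes reports docs/faq.txt as added and src/main.c as
removed. Renames would show up as one removal plus one addition, which
is the same limitation mentioned in the quoted design.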