On Fri, Jul 23, 2010 at 3:50 PM, Avery Pennarun <apenwarr@xxxxxxxxx> wrote:
> Honest question: do you care about the wasted disk space and download
> time for these extra files? Or just the fact that git gets slow when
> you have them?

I have a similar situation to the original poster (huge trees), and for me it's all three: disk space, download time, and performance.

My tree has a few relatively small (< 20 MB) shared directories of common code, a few large (2-6 GB) directories of code for OSes, and then several medium-size (< 500 MB) directories for application code. The application developers only care about the app+shared directories (and are very annoyed by the massive space and performance impact of the OS directories). The firmware-only developers only care about OS+shared and are mildly annoyed by the medium space and performance impact of the app directories.

I work on all of the pieces, but even I would prefer to have things separated so that when I work on the apps, git status etc. doesn't take a big hit for close to a million files in the OS directories (particularly when doing git status on Windows). Even when using the -uno option to git status, it's still pretty slow (over a minute).

git-submodule might be technically possible in this situation, but having to commit and push each submodule and then commit and push the super module makes it slightly worse than just dealing with the space/download/performance issues of one huge repository.

git-subtree could also possibly help, but there's still extra work to split and merge each repository. And I'm not sure how it handles commit IDs across the repositories, because I want to be able to say "I fixed that bug in shared/code.c in commit abc123" and have both the OS+shared and the apps+shared people be able to run git log abc123 and see the same change (and merge/cherry-pick/etc.).
I think what I want is a way to do a sparse checkout where some sort of module is maintained in the git repository (probably just an INI-style file with paths) so I can clone directly from the server and it figures out the objects I need for the full history of only apps+shared (or firmware+shared, etc.) on the server side and only sends those objects. I still want to be able to branch, tag, and refer to commit IDs. So I only take the space/download/performance hit of directories included in the module, but I don't have to manually maintain that view of the repository (as I do with git-submodule and git-subtree).

The closest thing to that so far for me has been the sparse checkout support added in git 1.7 combined with a convenience script I wrote. Everyone still has a huge download and .git directory, but at least the working copy is limited to the paths specified in the module, so git status isn't super slow (although just having all those objects in the .git directory still slows it down quite a bit).
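To make the module idea concrete, here is a rough sketch of what such a convenience script could look like. This is not the script mentioned above, only an illustration under assumptions: the module file format (one section per module, with a multi-line "paths" key), the directory names apps/ and os/, and the function names are all hypothetical. The only git behavior relied on is that sparse checkout reads its patterns from .git/info/sparse-checkout.

```python
import configparser
import os

# Hypothetical module file (the format is an assumption, not from git):
# each section names a module, and "paths" lists one directory per line.
EXAMPLE_MODULES = """
[apps]
paths =
    shared/
    apps/

[firmware]
paths =
    shared/
    os/
"""


def sparse_patterns(module_text, module_name):
    """Return the sparse-checkout patterns listed for one module."""
    parser = configparser.ConfigParser()
    parser.read_string(module_text)
    raw = parser.get(module_name, "paths")
    # Drop blank lines and surrounding whitespace; keep the listed order.
    return [line.strip() for line in raw.splitlines() if line.strip()]


def write_sparse_checkout(git_dir, patterns):
    """Write the patterns to <git_dir>/info/sparse-checkout, one per line."""
    info_dir = os.path.join(git_dir, "info")
    os.makedirs(info_dir, exist_ok=True)
    target = os.path.join(info_dir, "sparse-checkout")
    with open(target, "w") as handle:
        for pattern in patterns:
            handle.write(pattern + "\n")
    return target


if __name__ == "__main__":
    # Only prints the patterns; it does not touch any repository.
    for pattern in sparse_patterns(EXAMPLE_MODULES, "apps"):
        print(pattern)
```

After writing the file into a real repository, the working copy would be updated with "git config core.sparseCheckout true" followed by "git read-tree -mu HEAD". This only trims the working copy; it does nothing about the download size or the size of the .git directory, which is the part that would need server-side support.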