(note: on this mailing list, you shouldn't drop names from the cc: line
when replying to a thread)

On Mon, Jul 5, 2010 at 7:11 PM, Eric Niebler <eric@xxxxxxxxxxxx> wrote:
> On 7/5/2010 6:04 PM, Finn Arne Gangstad wrote:
>> This
>> should fit easily into a single repository. The Linux kernel is much
>> larger, and that is sort of the canonical single repo git project. I
>> _strongly_ recommend that you go for a single repo if you can make it
>> work.
>
> It does fit into one repo, but that doesn't meet our needs for the
> future. Users want to install and build library X and its dependencies,
> not all of boost. This is increasingly becoming a problem as boost
> grows. Imagine if a perl programmer had to download all of CPAN to use
> or hack on any one perl module. Or if contributing to CPAN meant getting
> the whole shebang, history and all. I'm sure even in the Linux kernel,
> not *every* third-party driver is maintained in the master git repo.

Actually, that's mostly not true; there are a few third-party drivers
that don't make it into the core Linux repo, but that's mostly because
they haven't been accepted by the kernel maintainers for whatever
reason (often quality or duplication, I guess). The goal for the vast
majority of Linux drivers is indeed to get merged into the Linux core.

...and it works pretty well, all things considered. It's certainly
not the only way to do it for every project, but it's actually a
pretty good way. The kernel repo history runs to hundreds of megs
nowadays, but on a modern Internet connection that's not a big deal.
And then you never have to worry about downloading more modules later.
You also never have versioning problems.

> We are aiming to make boost a clearing-house for C++ libraries (like
> CPAN, or PyPi for python), turning the official boost distribution into
> little more than a well-tested collection of the libraries that have
> passed our peer-review and regression test process.

Of course you will want to have some kind of really excellent
versioned dependency fetching system (exactly like CPAN or PyPi or
ruby gems) if you want this to be nice. git's submodules stuff is
almost certainly not going to add any features you need/want. On the
other hand, it's pretty easy to write your CPAN-like script around
cloning separate git repos.

> In fact, the modularization has already been done, and work is well
> underway on the infrastructure to support dependency tracking. But the
> modularization is not history-preserving and needs to be redone.

If your code doesn't move too many files around, then splitting out
the history is pretty easy with git-subtree (a tool I wrote that's not
part of git):

    git subtree split --prefix=path/to/subdir

(The prefix is relative to the top of the repository.) And you get a
new history for just that subdir. That might do exactly what you want.
It also works iteratively, so you can export your history from svn,
then re-export the changes as they occur over time.

>>> So, what are the options? Can I somehow delete from each repository the
>>> history that is irrelevant? Is there some feature of git I don't know
>>> about that can solve this problem for us?
>>
>> How do you define "irrelevant"? Do you only require enough history for
>> git annotate/blame to give correct results? Or does this only refer
>> to multiple repositories sharing the same ancient history?
>
> If multiple repositories share the same ancient history, wouldn't that
> give git annotate/blame enough information? Sorry, git newbie here.

Yes, it would. But how much of the ancient history do you want? If
you want all of it, you don't save any space in your repo.

> The plan is to move to git. However, we don't expect this to happen
> overnight, so a way to continue to pull changes from a svn mirror while
> the new git repositories are being set up would be ideal.

This isn't too hard to do; you just need some scripts around git-svn
and git-subtree (or whatever tool you use to do the splitting).
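
A minimal sketch of the split step and its iterative re-run, on a
throwaway repository. Everything named here (the `big` repo, `libs/foo`,
`libs/bar`, the `foo-only` branch) is hypothetical and only for
illustration. The svn side is left out: in a real mirror the new commits
would presumably arrive via `git svn rebase` rather than local commits.

```shell
#!/bin/sh
# Sketch only: split one subdirectory's history out of a larger repo,
# then re-run the split after more commits arrive.
# All paths and branch names below are made up for the demo.
set -e

# Identity for the throwaway commits below.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.invalid
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.invalid

# git-subtree is packaged separately from core git on some systems.
if [ ! -x "$(git --exec-path)/git-subtree" ]; then
    echo "git-subtree is not installed; skipping demo" >&2
    exit 0
fi

work=$(mktemp -d)
cd "$work"
git init -q big
cd big

# One repository holding two "libraries".
mkdir -p libs/foo libs/bar
echo one > libs/foo/a.txt
echo other > libs/bar/b.txt
git add .
git commit -q -m "add foo and bar"

# Export only the history that touches libs/foo onto a new branch.
# The prefix is relative to the top of the repository.
git subtree split --prefix=libs/foo --branch=foo-only >/dev/null
first=$(git rev-parse foo-only)

# More changes land later: one outside the subdir, one inside it.
echo more >> libs/bar/b.txt
git commit -q -am "update bar only"
echo two >> libs/foo/a.txt
git commit -q -am "update foo"

# Re-running the same split extends the existing branch.
git subtree split --prefix=libs/foo --branch=foo-only >/dev/null
second=$(git rev-parse foo-only)

# The split branch has a.txt at its root and no trace of libs/bar.
git log --oneline foo-only
```

The `foo-only` branch can then be pushed to its own repository. Because
re-running the split reproduces the earlier commits, the second run only
appends to what was exported before.
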
We've done this at work for a couple of years now and it's working
fine. The confusing part is taking *submissions* back through both
channels. If you value your sanity, you probably want to only allow
submissions back via svn while you're running the two in parallel; but
that makes git's added features a lot less useful, so you probably
want to run in parallel for only a short time.

Have fun,

Avery