Re: help moving boost.org to git

On 7/5/2010 7:32 PM, Avery Pennarun wrote:
> (note: on this mailing list, you shouldn't drop names from the cc:
> line when replying to a thread)

Noted, thanks.

> On Mon, Jul 5, 2010 at 7:11 PM, Eric Niebler <eric@xxxxxxxxxxxx> wrote:
>> On 7/5/2010 6:04 PM, Finn Arne Gangstad wrote:
>>> This
>>> should fit easily into a single repository. The Linux kernel is much
>>> larger, and that is sort of the canonical single repo git project. I
>>> _strongly_ recommend that you go for a single repo if you can make it
>>> work.
>>
>> It does fit into one repo, but that doesn't meet our needs for the
>> future. Users want to install and build library X and its dependencies,
>> not all of boost. This is increasingly becoming a problem as boost
>> grows. Imagine if a perl programmer had to download all of CPAN to use
>> or hack on any one perl module. Or if contributing to CPAN meant getting
>> the whole shebang, history and all. I'm sure even in the Linux kernel,
>> not *every* third-party driver is maintained in the master git repo.
> 
> Actually, that's mostly not true; there are a few third-party drivers
> that don't make it into the core Linux repo
<snip discussion showing my ignorance of Linux's repository structure>

Thanks for the correction. The CPAN/PyPi analogy is still apt.

>> We are aiming to make boost a clearing-house for C++ libraries (like
>> CPAN, or PyPi for python), turning the official boost distribution into
>> little more than a well-tested collection of the libraries that have
>> passed our peer-review and regression test process.
> 
> Of course you will want to have some kind of really excellent
> versioned dependency fetching system (exactly like CPAN or PyPi or
> ruby gems) if you want this to be nice.  git's submodules stuff is
> almost certainly not going to add any features you need/want.  On the
> other hand, cloning a separate git repo is pretty easy to write your
> CPAN-like script around.

Indeed, we are stealing the work of the Python guys. pip does most of
what we want. They've graciously been accepting our patches, so it now
happily clones git repos in order to satisfy dependencies. It is
some kind of really excellent! :-)
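
A minimal sketch of the "fetch library X and its dependencies" idea,
in Python. This is not pip's implementation; the dependency map, the
library names in it, and the example.com base URL are made up for
illustration, and the clone commands are only built, never run. It
assumes Python 3.9+ for graphlib.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency map: each library lists the libraries it needs.
DEPENDS_ON = {
    "spirit": {"fusion", "mpl"},
    "fusion": {"mpl"},
    "mpl": set(),
    "regex": set(),
}


def install_order(target, depends_on):
    """Return target plus its transitive dependencies, dependencies first."""
    needed = set()
    stack = [target]
    while stack:
        lib = stack.pop()
        if lib not in needed:
            needed.add(lib)
            stack.extend(depends_on[lib])
    # Restrict the graph to what the target actually needs, then sort it.
    graph = {lib: depends_on[lib] for lib in needed}
    return list(TopologicalSorter(graph).static_order())


def clone_commands(target, depends_on, base_url):
    """Build one `git clone` argv per needed library (nothing is executed)."""
    return [
        ["git", "clone", f"{base_url}/{lib}.git"]
        for lib in install_order(target, depends_on)
    ]


if __name__ == "__main__":
    # base_url is a placeholder, not a real Boost location.
    for cmd in clone_commands("spirit", DEPENDS_ON, "https://example.com/boost"):
        print(" ".join(cmd))
```

Asking for "spirit" here never touches "regex", which is the point of
per-library repositories.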

>> In fact, the modularization has already been done, and work is well
>> underway on the infrastructure to support dependency tracking. But the
>> modularization is not history-preserving and needs to be redone.
> 
> If your code doesn't move too many files around, then splitting out
> the history is pretty easy with git-subtree (a tool I wrote that's not
> part of git):
> 
>    git subtree split --prefix=path/to/subdir
> 
> And you get a new history for just that subdir.  That might do exactly
> what you want.  It also works iteratively, so you can export your
> history from svn, then re-export the changes as they occur over time.

It looks like this is it:

  http://github.com/apenwarr/git-subtree

I'll have to read the docs. Thanks for the tip.
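
A rough Python sketch of scripting the split suggested above, one
library at a time. The path "libs/example" and the branch name are
hypothetical. It assumes git-subtree is installed, that the prefix is
given relative to the repository root (hence the slash stripping), and
that the split prints the new head commit id on stdout.

```python
import subprocess


def subtree_split_command(prefix, branch=None):
    """Build the argv for `git subtree split` on one subdirectory."""
    # Assumption: the prefix is relative to the repository root, so any
    # leading or trailing slashes are dropped.
    rel = prefix.strip("/")
    if not rel:
        raise ValueError("prefix must name a subdirectory")
    cmd = ["git", "subtree", "split", f"--prefix={rel}"]
    if branch is not None:
        # Store the extracted history on a branch of its own.
        cmd.append(f"--branch={branch}")
    return cmd


def split_library(repo_dir, prefix, branch=None):
    """Run the split inside repo_dir and return the commit id it prints."""
    result = subprocess.run(
        subtree_split_command(prefix, branch),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical library path and branch name; only prints the command.
    print(" ".join(subtree_split_command("libs/example", "example-only")))
```

Re-running the same split after new commits arrive is how the
iterative re-export mentioned above would be driven from a script.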

>>>> So, what are the options? Can I somehow delete from each repository the
>>>> history that is irrelevant? Is there some feature of git I don't know
>>>> about that can solve this problem for us?
>>>
>>> How do you define "irrelevant"? Do you only require enough history for
>>> git annotate/blame to give correct results?  Or does this only refer
>>> to multiple repositories sharing the same ancient history?
>>
>> If multiple repositories share the same ancient history, wouldn't that
>> give git annotate/blame enough information? Sorry, git newbie here.
> 
> Yes, it would.  But how much of the ancient history do you want?  If
> you want all of it, you don't save any space in your repo.

Repos, plural. We'd save space because the history wouldn't be
duplicated in each one. Right? Or else I'm confused and this is something
that will become clear after I understand what git subtree does.

Right now, the other boost developers are pushing for a solution that
uses grafts. I'm fuzzy on what they are exactly, but it seems that we'd
freeze a svn mirror and have anybody interested in history put grafts in
their local repository pointing back at the mirror. I don't know enough
yet to say what the pros/cons of this approach might be wrt git subtree.
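
For reference, a graft is a single line in .git/info/grafts: a commit
id followed by the ids of the parents git should pretend that commit
has. A minimal Python sketch of writing such a line follows; the ids
in it are placeholders, not real Boost commits, and the helper names
are made up. (Newer git versions deprecate the grafts file in favor of
`git replace --graft`.)

```python
import re

# A full SHA-1 object id: 40 lowercase hex digits.
SHA1_HEX = re.compile(r"[0-9a-f]{40}")


def graft_line(commit, parents):
    """Format one grafts entry: the commit id, then its pretend parents."""
    ids = [commit, *parents]
    for object_id in ids:
        if not SHA1_HEX.fullmatch(object_id):
            raise ValueError(f"not a full SHA-1 id: {object_id!r}")
    return " ".join(ids)


def append_graft(grafts_path, commit, parents):
    """Append one entry to a grafts file such as .git/info/grafts."""
    with open(grafts_path, "a", encoding="ascii") as grafts_file:
        grafts_file.write(graft_line(commit, parents) + "\n")


if __name__ == "__main__":
    # Placeholder ids: the first commit of a new per-library repository
    # and the last commit of the frozen svn mirror.
    first_new_commit = "a" * 40
    last_mirror_commit = "b" * 40
    print(graft_line(first_new_commit, [last_mirror_commit]))
```

The graft only takes effect in a repository that also contains the
mirror's objects, so each developer who wants the old history would
have to fetch the mirror as well.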

>> The plan is to move to git. However, we don't expect this to happen
>> overnight, so a way to continue to pull changes from a svn mirror while
>> the new git repositories are being set up would be ideal.
> 
> This isn't too hard to do; you just need some scripts around git-svn
> and git-subtree (or whatever tool you use to do the splitting).  We've
> done this at work for a couple of years now and it's working fine.

Cool.

> The confusing part is taking *submissions* back through both channels.
> If you value your sanity, you probably want to only allow submissions
> back via svn while you're running the two in parallel; but that makes
> git's added features a lot less useful, so you probably want to run in
> parallel for only a short time.

Oh my! I don't think we'd open the git repositories for changes until
after we close down svn. This problem is hard enough.

-- 
Eric Niebler
BoostPro Computing
http://www.boostpro.com