Re: blobs (once more)

Hi,

On Wed, 6 Apr 2011, Pau Garcia i Quiles wrote:

> Binary large objects. I know it has been discussed once and again but 
> I'd like to know if there is something new.
> 
> Some corporation hired the company I work for one year ago to develop a 
> large application. They imposed ClearCase as the VCS. I don't know if 
> you have used it but it is a pain in the ass. We have lost weeks of 
> development to site-replication problems, funny merges, etc. We are 
> trying to migrate our project to git, which we have experience with.
> 
> One very important point in this project (which is Windows only) is 
> putting binaries in the repository. So far, we have succeeded in not 
> doing that in other projects but we will need to do that in this 
> project.
> 
> In the Windows world, it is not unusual to use third-party libraries 
> which are only available in binary form. Getting them as source is not 
> an option because the companies developing them are not selling the 
> source. Moving from those binary-only dependencies to something else is 
> not an option either because what we are using has some unique features, 
> be it technical features or support features. In our project, we have 
> about a dozen such binaries, ranging from a few hundred kilobytes, to a 
> couple hundred megabytes (proprietary database and virtualization 
> engine).
> 
> The usual answer to the "I need to put binaries in the repository" 
> question has been "no, you do not". Well, we do. We are in heavy 
> development now, therefore today's version may depend on a certain 
> version of a third-party shared library (DLL) which we only can get in 
> binary form, and tomorrow's version may depend on the next version of 
> that library, and you cannot mix today's source with yesterday's 
> third-party DLL. I.e., to be able to use the code from 7 days ago at 
> 11.07 AM you need "git checkout" to "return" our source AND the binaries 
> we were using back then. This is something ClearCase manages 
> satisfactorily.

I understand. The problem in your case might not be too bad after all. 
It only arises when the big files are already compressed, because Git 
then finds few similarities between successive versions to delta against. 
If you check in multiple versions of an uncompressed .dll file, Git will 
usually do a very good job of compressing them.
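
As a rough illustration of that difference, here is a small Python sketch. It 
does not use Git's own delta code; it uses zlib with a preset dictionary as a 
crude stand-in for "store the new version relative to the old one". The 
synthetic payload, the helper names and the sizes are all assumptions made up 
for the example.

```python
import random
import zlib


def make_versions(size=20000, edits=20, seed=1234):
    """Return two similar byte strings standing in for two DLL builds."""
    rng = random.Random(seed)
    v1 = bytes(rng.randrange(16) for _ in range(size))
    v2 = bytearray(v1)
    for pos in range(0, size, size // edits):
        v2[pos] ^= 0xFF  # change one byte at regular intervals
    return v1, bytes(v2)


def delta_size(base, target):
    """Size of `target` when deflated with `base` as a preset dictionary."""
    comp = zlib.compressobj(level=9, zdict=base)
    return len(comp.compress(target) + comp.flush())


v1, v2 = make_versions()
# The same two versions, but compressed before being "checked in".
c1, c2 = zlib.compress(v1), zlib.compress(v2)

raw_delta = delta_size(v1, v2)
packed_delta = delta_size(c1, c2)
print(f"uncompressed:   {len(v2)} bytes, relative to previous version: {raw_delta}")
print(f"pre-compressed: {len(c2)} bytes, relative to previous version: {packed_delta}")
```

The second version of the uncompressed payload should cost only a small 
fraction of its full size, while the pre-compressed one costs close to its 
full size again, because a few changed bytes in the input scramble most of 
the compressed stream.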

If they are compressed, what you probably need is something like a sparse 
clone, which is sort of available in the form of shallow clones, but those 
are still too limited.

Having said that, another company I work for has 20 GB repositories, 
and they will grow larger. They ended up there for historical reasons, 
and they are willing to pay the price in terms of disk space. Thanks to 
Git's distributed nature, they had no problems with cloning; they just 
use a local reference repository for the initial clone.
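
For reference, the two clone variants mentioned above map to the --reference 
and --depth options of "git clone". A minimal Python sketch of a wrapper 
follows; the helper names, the URL and the paths are hypothetical and only 
serve as an illustration.

```python
import subprocess


def clone_command(url, dest, reference=None, depth=None):
    """Build the argument list for `git clone` with optional space savers."""
    cmd = ["git", "clone"]
    if reference is not None:
        # Borrow objects from an existing local repository.
        cmd += ["--reference", reference]
    if depth is not None:
        # Shallow clone: fetch only the most recent `depth` commits.
        cmd += ["--depth", str(depth)]
    return cmd + [url, dest]


def clone(url, dest, **options):
    """Run the clone; raises CalledProcessError if git fails."""
    subprocess.run(clone_command(url, dest, **options), check=True)


# Hypothetical URL and paths, for illustration only.
print(" ".join(clone_command("git://example.com/big.git", "big",
                             reference="/srv/git/big.git")))
```

With a reference repository on the local disk or LAN, only the objects that 
are missing locally have to come over the slow link.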

> I have read about:
> - submodules + using different repositories once one "blob repository"  
>   grows too much. This will be probably rejected because it is quite 
>   contrived.

I would also recommend against this, because submodules are a very weak 
part of Git.

> - git-annex (does not get the files in when cloning, pulling, checking 
>   out; you need to do it manually)
> - git-media (same as git-annex)

Yes, this is an option, but a bit clunky.

> - boar (no, we do not want to use a VCS for binaries in addition to git)

I did not know about that.

> - and a few more
> 
> So far the only good solution seems to be git-bigfiles but it's still
> in development.

It has stalled, apparently, but I wanted to have a look at it anyway. Will 
let you know of my findings!

> Is there any good solution for my use case, where version = sources 
> version + binaries version?
> 
> Thank you.
> 
> If we succeed with git here, the whole corporation (150,000+
> employees, Fortune 500) may start to move to git in a year. Many
> people are fed up with CC there.

Ciao,
Johannes
