On Thu, 4 Feb 2010, demerphq wrote:

> At $work we have a host where we have about 50-100 users each with
> their own private copies of the same repos. These are cloned from a
> remote via git/ssh and are not thus automatically hardlinking their
> object stores.
>
> This is starting to take a lot of space.

You should keep a pristine copy of that common repository on that host,
make it readable by everyone, and then ask your users to pass the
--reference argument to 'git clone' so that their clones borrow as much
as possible from that common repository.

Those who already cloned the repository in full, i.e. without the
--reference switch, can fix the situation simply by adding the full path
to the common repository's .git/objects directory to their own
.git/objects/info/alternates file (create it if it doesn't exist) and
then running 'git gc'.  That's what the --reference argument to the
clone command does: it sets up that .git/objects/info/alternates file.

> I was thinking it should be possible to hardlink all of the objects in
> the different repos to a canonical single copy.
>
> Would i be correct in thinking that if i have two repos with an
> equivalent .git/objects/../..... file in them that the files are
> necessarily identical and one can be replaced by a hardlink to the
> other?

Yes, you could do that.  However, you'll save very little that way, as
the bulk of a repository's content is normally stored in pack files,
and those may differ from one repository to another depending on what
exactly each pack contains.

The alternates mechanism is more powerful: it lets Git fetch objects
from the canonical repository whether they are packed or not, and more
importantly it avoids creating local copies of new objects that already
exist in that canonical copy, so you don't have to constantly search
every user's repository for potential new objects to hardlink.

> If this is correct then is there some tool known to the list that
> already does this?
> I whipped this together:

The "tool" already exists in Git and is what I describe above.  The
actual tool you might need is probably a script to populate that
.git/objects/info/alternates file in all your users' repositories and
maybe run 'git gc' on their behalf.


Nicolas
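
For illustration only, here is a minimal Python sketch of the kind of
script mentioned in the reply.  The function names add_alternate and
run_gc are hypothetical and not part of Git; only the
.git/objects/info/alternates file and 'git gc' come from the message
itself.  The sketch assumes the alternates file holds one path per line
and that whoever runs it may write to each user's .git directory.

```python
import os
import subprocess


def add_alternate(repo_git_dir, common_objects_dir):
    """Record common_objects_dir in the repo's objects/info/alternates file.

    repo_git_dir is a user's .git directory; common_objects_dir is the
    .git/objects directory of the shared, pristine repository.
    Returns True if an entry was added, False if it was already there.
    (Hypothetical helper, not a Git API.)
    """
    info_dir = os.path.join(repo_git_dir, "objects", "info")
    os.makedirs(info_dir, exist_ok=True)  # "create it if it doesn't exist"
    alternates = os.path.join(info_dir, "alternates")
    wanted = os.path.abspath(common_objects_dir)  # the full path is required

    # Read any entries already present, ignoring blank lines.
    entries = []
    if os.path.exists(alternates):
        with open(alternates) as f:
            entries = [line.strip() for line in f if line.strip()]

    if wanted in entries:
        return False

    # Rewrite the file with the new entry appended, one path per line.
    entries.append(wanted)
    with open(alternates, "w") as f:
        f.write("\n".join(entries) + "\n")
    return True


def run_gc(repo_git_dir):
    """Run 'git gc' in the given repository, as the message suggests.

    Requires the git executable on PATH; raises CalledProcessError if
    git exits with a non-zero status.
    """
    subprocess.run(["git", "--git-dir=" + repo_git_dir, "gc"], check=True)
```

To cover every user, one would call add_alternate() and then run_gc()
once per user repository, for example with "/home/alice/project/.git"
and "/srv/git/common/.git/objects" as arguments (both paths are made
up for this example).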