Brandon Casey <casey@xxxxxxxxxxxxxxx> wrote: > > I think the goal of these three objects is space savings (and speed), > but I don't understand when I should prefer one option over another, or > when/whether to use a combination of them. And I am unsure (SCARED) > about any side effects they may have. Yes, they are mainly about saving time setting up the new clone, and about disk space required by the new clone. > 1) What does local mean? > --local says repository must be on the "local" machine and claims it > attempts to make hardlinks when possible. Of course hard links cannot > be created across filesystems, so are there other speedups/space > savings when repository is on local machine but not on the same > filesystem? Is this option still valid then? Basically --local means instead of using the native Git transport to copy object data from one repository to another we shortcut and use `find . | cpio -lpumd` or somesuch, so that cpio can use hardlinks if possible (same filesystem) but fallback to whole copy if it cannot. This is usually faster than the native Git transport as we copy every file, without first trying to compute if the file would be needed by the new clone or not. So --local may copy garbage that git-prune would have removed, or that git-repack/git-gc might have eliminated from a packfile. But generally that's such a small amount of data that the faster cpio path (and even better, the hardlinks) saves disk. Note we only hardlink the immutable data under .git/objects; the mutable data and the working directory files that are checked out are *not* hardlinked. > 2) Does --shared imply shared write access? Does --local? > I'll point out that git-init has an option with the same name. No. --shared means something entirely different in git-clone than it does in git-init. The --shared here implies adds the source repository to the new repository's .git/objects/info/alternates. This means that the new clone doesn't copy the object database; instead it just accesses the source repository when it needs data. This exposes two risks: a) Don't delete the source repository. If you delete the source repository then the clone repository is "corrupt" as it won't be able to access object data. b) Don't repack the source repository without accounting for the refs and reflogs of all --shared repositories that came from it. Otherwise you may delete objects that the source repository no longer needs, but that one or more of the --shared repositories still needs. Objects that are newly created in a --shared repository are written in the --shared area, not in the source repository. Hence the source repository can be read-only to the current user. > 3) --shared seems like a special case of --reference? Are there > differences? --reference is actually a special case of --shared. --reference is meant for cloning a remote repository over the network, where you already have an existing local repository that has most of the objects you need to successfully clone the remote repository. With --reference we setup a temporary copy of refs from the --reference repository in the new repository, so that during the network transfer from the remote system we don't download things the --reference repository already has. But --reference implies --shared, and has the same issues as above. > 4) what happens if the source repository dissappears? Is --local ok > but --shared screwed? Correct. > 4) is space savings obtained only at initial clone? or is it on going? > does a future git pull from the source repository create new hard > links where possible? Only on initial clone. Later pulls will copy. You can try using git-relink to redo the hardlinks after the pull. > Can --shared be used with --reference. Can --reference be used multiple > times (and would I want to). Does -l with -s get you anything? (the > examples use this) --reference can only be given once in a git-clone; we only setup one set of temporary references during the network transfer. And as I said above, --reference implies --shared. -- Shawn. - To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html