Re: git-clone --single-branch clones objects outside of branch

On Sun, Jan 26, 2020 at 9:55 PM Jeff King <peff@xxxxxxxx> wrote:
> On Sun, Jan 26, 2020 at 04:39:52AM -0800, Chris Jerdonek wrote:
> > However, when I attempted this with a local repo, I found that objects
> > located only in branches other than the branch I specified are also
> > cloned. Also, this is true even if the remote repo has only loose
> > objects (i.e. no pack files). So it doesn't appear to be doing this
> > only to avoid creating new files.
>
> This is the expected outcome, because in your example you're cloning on
> the local filesystem. By default that enables some optimizations, one of
> which is to hard-link the object files into the destination repository.
> That avoids the cost of copying and re-hashing them (which a normal
> cross-system clone would do). And it even avoids traversing the objects
> to find which are necessary, instead just hard-linking everything.

Thanks for the reply. It's fine for that to be the expected behavior.
My suggestion is just that the documentation for --single-branch be
updated to say that objects unreachable from the specified branch can
still end up in the cloned repo when the --local optimizations are in
effect. This can matter for security: for example, when one is trying
to create a clone that excludes data from branches with sensitive
info, following Git's advice to create a separate repo if security of
private data is desired:
https://git-scm.com/docs/gitnamespaces#_security
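
To make the concern concrete, here is a minimal sketch of a
reproduction. It assumes a throwaway repo built in a temp directory;
the names "src", "secret", "local-clone" and "nolocal-clone" are made
up for illustration. It checks whether a blob that exists only on the
other branch is present in each clone, once with the default local
optimizations and once with --no-local, which should make the clone go
through the normal transport instead of hard-linking.

```shell
#!/bin/sh
set -eu

# Build a scratch source repo (all names here are hypothetical).
tmp=$(mktemp -d)
cd "$tmp"
git init -q src
cd src
git config user.email "you@example.com"
git config user.name "Example"

# One commit on the initial branch...
echo public > public.txt
git add public.txt
git commit -q -m "public"
main_branch=$(git symbolic-ref --short HEAD)

# ...and one commit that exists only on a second branch.
git checkout -q -b secret
echo sensitive > secret.txt
git add secret.txt
git commit -q -m "secret"
secret_blob=$(git rev-parse secret:secret.txt)
git checkout -q "$main_branch"
cd ..

# Clone the same branch twice: with and without the local optimizations.
git clone -q --single-branch --branch "$main_branch" src local-clone
git clone -q --single-branch --branch "$main_branch" --no-local src nolocal-clone

# "git cat-file -e" exits 0 only if the object exists in the repo.
if git -C local-clone cat-file -e "$secret_blob" 2>/dev/null; then
    local_has=yes
else
    local_has=no
fi
if git -C nolocal-clone cat-file -e "$secret_blob" 2>/dev/null; then
    nolocal_has=yes
else
    nolocal_has=no
fi

echo "default local clone has secret blob: $local_has"
echo "--no-local clone has secret blob:    $nolocal_has"
```

Given the explanation quoted above, the expectation is that the blob
shows up only in the default local clone. Cloning from a file:// URL
instead of a plain path should also bypass the local optimizations.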

I'm guessing other flags also don't apply when --local is being used.
For example, I'm guessing --reference is also ignored when using
--local, but I haven't checked yet to confirm. It would be nice if the
documentation gave a heads up in cases like these. Even if hard links
are being used, it's not clear from the docs whether the objects are
filtered first, prior to hard linking, when flags like --single-branch
and --reference are passed.

> This one behaves as you expected because git-fetch does not perform the
> same optimizations (it wouldn't make as much sense there, as generally
> in a fetch we already have most of the objects from the other side
> anyway, so hard-linking would just give us duplicates).

Incidentally, here's a report from 2010 requesting that this
optimization be available in the git-fetch case:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=573909
(I don't know how reports on the Debian bug tracker relate to this list.)

--Chris


