Re: Slow pushes on 'pu' - even when up-to-date..

On Tue, Oct 04, 2016 at 01:18:45PM +0200, Heiko Voigt wrote:

> On Mon, Oct 03, 2016 at 02:11:36PM -0700, Linus Torvalds wrote:
> > This seems to be because I'm now on 'pu' as of a day or two ago in
> > order to test the abbrev logic, but lookie here:
> > 
> >     time git ls-remote ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux
> >     .. shows all the branches and tags ..
> >     real 0m0.655s
> >     user 0m0.011s
> >     sys 0m0.004s
> > 
> > so the remote is fast to connect to, and with network connection
> > overhead and everything, it's just over half a second. But then:
> > 
> >     time git push ra.kernel.org:/pub/scm/linux/kernel/git/torvalds/linux
> 
> The reason behind this is that when pushing to an address we do not
> have the remote refs readily available to compare against. When pushing
> an existing ref it would be easy and could take a shortcut, but it gets
> more complicated for new refs. Currently we fall back to walking the
> whole history since that is "the most correct way" we have. But
> obviously it is not a practical solution in any way.
> 
> I mentioned this fact when discussing the current state and my patches
> to make this check less painful. So we still need to think about a
> solution for this check when passing an address.
> 
> IMO: It's definitely not ready to be switched on as default, unless we
> find something a lot cheaper for the above case.
> 
> My idea of a solution goes like this:
>   * collect all SHA1's of the remote's refs
>   * check if we have them locally
>   * if not we abort and tell the user to fetch them somehow into local
>     refs or disable the check
>   * when we have them locally we proceed passing those SHA1's as bases
>     instead of --remotes=<name>
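
The steps in the quoted list could be sketched roughly as below. This is
only an illustration in Python, not how git implements the check: the
helper names (parse_ls_remote, object_exists_locally, rev_list_args) are
made up for this example, and the exact rev-list invocation the push code
would use is an assumption. It relies only on the "<sha> TAB <ref>" output
format of `git ls-remote`, on `git cat-file -e` exiting non-zero for a
missing object, and on the `git rev-list <tips> --not <bases>` syntax.

```python
import subprocess
from typing import Callable, Iterable, List


def parse_ls_remote(output: str) -> List[str]:
    """Collect the unique SHA-1s from `git ls-remote` output (one "<sha> TAB <ref>" per line)."""
    shas: List[str] = []
    for line in output.splitlines():
        if not line.strip():
            continue
        sha = line.split("\t", 1)[0].strip()
        if sha not in shas:
            shas.append(sha)
    return shas


def object_exists_locally(sha: str) -> bool:
    """Return True if `git cat-file -e` finds the object in the local repository."""
    result = subprocess.run(
        ["git", "cat-file", "-e", sha],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def rev_list_args(
    new_tips: Iterable[str],
    remote_shas: Iterable[str],
    exists: Callable[[str], bool] = object_exists_locally,
) -> List[str]:
    """Build a rev-list command that uses the remote's SHA-1s as bases.

    Aborts (raises LookupError) if any remote SHA-1 is missing locally,
    so the caller can tell the user to fetch first or disable the check.
    """
    bases = list(remote_shas)
    missing = [sha for sha in bases if not exists(sha)]
    if missing:
        raise LookupError(
            "remote objects not available locally, fetch them or disable "
            "the check: " + ", ".join(missing)
        )
    # Hypothetical invocation: walk only what the remote does not have yet.
    return ["git", "rev-list", *new_tips, "--not", *bases]
```

The `exists` parameter is injectable only so the logic can be exercised
without a repository; by default it shells out to git.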

As I argued in [1], I think it's not just "this must be cheaper" but
"this must not be enabled if submodules are not in use at all".  Most
repositories don't use submodules, so anything that causes any extra
traversal, even of a portion of the history, is going to be a net
negative for a lot of people.

I think the only sane default is going to be some kind of heuristic that
says "submodules are probably in use". Something like "is there a
.gitmodules file" is not perfect (you can have gitlink entries without
it), but it's a really cheap constant-time check.
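
As a rough illustration of how cheap such a heuristic is, here is a
minimal Python sketch. The function name submodules_probably_in_use is
hypothetical, and it assumes a non-bare repository whose working tree
root is passed in. It amounts to a single file lookup, with no history
traversal, and it shares the imperfection noted above: gitlink entries
without a .gitmodules file are not detected.

```python
import os


def submodules_probably_in_use(worktree: str) -> bool:
    """Heuristic only: True if a .gitmodules file sits at the top of the working tree."""
    return os.path.isfile(os.path.join(worktree, ".gitmodules"))
```

A caller could run the more expensive submodule check on push only when
this returns True.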

-Peff

[1] Quoted in
    http://public-inbox.org/git/xmqqh9aaot49.fsf@xxxxxxxxxxxxxxxxxxxxxxxxxxx/


