RE: [External Mail]Re: why git is so slow for a tiny git push?

程洋 <chengyang@xxxxxxxxxx> · Thu, 25 Nov 2021 02:53:33 +0000

Well, we do have 300k refs, but only 1000 of them are under refs/heads.
However, I think most users only require refs/heads, and a few people only require refs/tags. As for the other refs, we hardly see any use case.

So jgit handles this in a smart way: it creates 2 pack files and 2 bitmaps. Pack A contains all refs/heads, and pack B contains the other refs. When a user does a fresh clone, it just needs to send pack A, without determining whether it can be reused or not.
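To make the split concrete, here is a minimal sketch of the idea in Python. It is only a toy model, not jgit's actual code: the commit graph is a plain dict, and the names `reachable`, `split_packs`, and the sample refs are all hypothetical.

```python
# Toy model only: the graph maps each commit to its parents.
# All names and data here are hypothetical, not taken from jgit.

def reachable(graph, tips):
    """Return every commit reachable from the given tips."""
    seen = set()
    stack = list(tips)
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(graph[commit])
    return seen


def split_packs(graph, refs):
    """Pack A: reachable from refs/heads/. Pack B: everything else."""
    head_tips = [c for name, c in refs.items()
                 if name.startswith("refs/heads/")]
    pack_a = reachable(graph, head_tips)
    pack_b = reachable(graph, refs.values()) - pack_a
    return pack_a, pack_b


graph = {"c1": [], "c2": ["c1"], "c3": ["c2"], "r1": ["c2"], "t1": ["c1"]}
refs = {
    "refs/heads/main": "c3",
    "refs/changes/01/1": "r1",
    "refs/tags/v1": "t1",
}
pack_a, pack_b = split_packs(graph, refs)
```

In this model a clone that only wants refs/heads is fully covered by `pack_a`, so nothing from `pack_b` has to be examined.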

-----Original Message-----
From: Jeff King <peff@xxxxxxxx>
Sent: Thursday, November 25, 2021 2:15 AM
To: 程洋 <chengyang@xxxxxxxxxx>
Cc: Taylor Blau <me@xxxxxxxxxxxx>; Ævar Arnfjörð Bjarmason <avarab@xxxxxxxxx>; git@xxxxxxxxxxxxxxx
Subject: Re: [External Mail]Re: why git is so slow for a tiny git push?

On Tue, Nov 23, 2021 at 06:42:12AM +0000, 程洋 wrote:

> I have another problem here.
> When I try to clone from the remote server, it takes 25 seconds for `enumerating objects`, and then 1 second for `counting objects` by bitmap.
> I don't understand: why does a fresh clone need `enumerating objects`? Isn't `counting objects` enough for the server to determine what to send?

In older versions of Git, the "counting objects" progress meter used to be the actual object graph traversal. That changed in v2.18 (via 5af050437a), but you may still see some references to "counting objects is expensive".

These days that is called "enumerating objects", and "counting objects" is just doing a quick-ish pass over that list to do some light analysis (e.g., if we can reuse an on-disk delta). I'd expect "enumerating" to be expensive in general, and "counting" to be quick in general.

The "enumerating" phase is where we determine what to send, whether it's for a clone or a fetch, and may involve opening up a bunch of trees to walk the graph. It's what reachability bitmaps are supposed to make faster. But if you have 300k refs, as you've mentioned, you almost certainly don't have complete coverage of all of the ref tips, so we'll have to fall back to doing at least a partial graph traversal.
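The effect of incomplete bitmap coverage can be illustrated with a small Python sketch. This is a toy model under stated assumptions, not Git's implementation: bitmaps are modeled as Python sets keyed by commit, only commits are tracked (no trees or blobs), and `enumerate_objects` and the sample data are hypothetical.

```python
# Toy model only: bitmaps are sets of reachable commits, keyed by commit.
# All names and data here are hypothetical, not Git internals.

def enumerate_objects(graph, bitmaps, wants):
    """Return (objects_to_send, commits_walked) for the wanted tips."""
    result = set()
    walked = 0
    stack = list(wants)
    while stack:
        commit = stack.pop()
        if commit in result:
            continue
        if commit in bitmaps:
            # Bitmap hit: take the precomputed set, no traversal needed.
            result |= bitmaps[commit]
            continue
        # No bitmap for this commit: fall back to walking its parents.
        walked += 1
        result.add(commit)
        stack.extend(graph[commit])
    return result, walked


graph = {"c1": [], "c2": ["c1"], "c3": ["c2"], "x1": ["c3"], "x2": ["x1"]}
bitmaps = {"c3": {"c1", "c2", "c3"}}  # only one tip has a bitmap

covered = enumerate_objects(graph, bitmaps, ["c3"])
uncovered = enumerate_objects(graph, bitmaps, ["c3", "x2"])
```

Here the tip with a bitmap costs no traversal at all, while the tip without one has to be walked until it reaches a commit that does have a bitmap. With many uncovered ref tips, that partial walk is where the time goes.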

Taylor (cc'd) has been looking at some tricks for speeding up cases like this with a lot of refs. But I don't think there's anything to show publicly yet.

-Peff