Re: Partial-clone cause big performance impact on server

程洋 <chengyang@xxxxxxxxxx> writes:
>     3. With GIT_TRACE_PACKET=1, we found that on big repositories (200K+ refs, 6M+ objects), Git sends 40k "want" lines.
>     4. We then traced our server (Gerrit with JGit) and found that it was counting objects. We checked those 40k objects; most of them are blobs rather than commits (which means they're not in the bitmap).
>     5. We believe that's the root cause of our problem: Git sends too many "want SHA1" lines for objects that are not in the bitmap, causing the server to count objects frequently, which slows down the server.
> 
> What we want is to download only what we need to check out a specific commit. But if one commit contains so many objects (like ours, 40k+), counting takes more time than downloading.
> Is it possible to have Git send only a "commit want" rather than all the object SHA1s one by one?

On a technical level, it may be possible - at the point in the Git code
where the batch prefetch occurs, I'm not sure if we have the commit, but
we could plumb the commit information there. (We have the tree, but this
doesn't help us here because as far as I know, the tree won't be in the
bitmap so the server would need to count objects anyway, resulting in
the same problem.)

However, sending only commits as wants would mean that we would be
fetching more blobs than needed. For example, if we were to clone (with
checkout) and then checkout HEAD^, sending a "commit want" for the
latter checkout would result in all blobs referenced by the commit's
tree being fetched and not only the blobs that are different.
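
To make that trade-off concrete, here is a toy model in Python. It is
only a sketch: a commit's tree is reduced to a path-to-blob mapping, the
paths and IDs are made up, and it assumes (as in the scenario above)
that a "commit want" makes the server send every blob in that commit's
tree.

```python
# Toy model only: a tree is a dict mapping path -> blob ID.
# All paths and blob IDs below are hypothetical.

def blobs_for_blob_wants(target_tree, local_blobs):
    """Blobs requested when the client sends one want per missing blob."""
    return {blob for blob in target_tree.values() if blob not in local_blobs}

def blobs_for_commit_want(target_tree):
    """Blobs sent if a commit want yields every blob in the commit's tree
    (the assumption stated in the lead-in)."""
    return set(target_tree.values())

# State after a clone with checkout: all blobs of HEAD are local.
head = {"README": "b1", "main.c": "b2", "util.c": "b3"}
# HEAD^ differs from HEAD in a single file.
head_parent = {"README": "b1", "main.c": "b2-old", "util.c": "b3"}

local = set(head.values())
per_blob = blobs_for_blob_wants(head_parent, local)
per_commit = blobs_for_commit_want(head_parent)

print(len(per_blob), len(per_commit))
```

With per-blob wants only the one changed blob is requested, whereas the
commit want pulls in all three blobs of the tree, two of which are
already present locally.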

One idea that we (at $DAYJOB) had is to supply a commit hint so that the
server can first use bitmaps to narrow down the objects that need to be
checked. I had a preliminary patch for that [1] but as of now, no one
has continued pursuing that idea.
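
As a rough illustration of the idea (not of the patch itself), here is
one possible reading in Python. Everything in it is an assumption made
for the sketch: the bitmap index is modeled as a plain dict from commit
ID to the set of object IDs reachable from that commit, and
split_wants_with_hint is a hypothetical helper, not a function in Git
or JGit.

```python
# Hypothetical model: bitmap_index maps a commit ID to the set of object
# IDs reachable from it. Real bitmaps are compressed bit arrays.

def split_wants_with_hint(wants, commit_hint, bitmap_index):
    """Split wants into those covered by the hinted commit's bitmap and
    those that would still need the slower object walk."""
    reachable = bitmap_index.get(commit_hint, set())
    covered = [w for w in wants if w in reachable]
    uncovered = [w for w in wants if w not in reachable]
    return covered, uncovered

# One commit with a bitmap; object IDs are made up.
bitmap_index = {"c1": {"c1", "t1", "b1", "b2", "b3"}}
wants = ["b1", "b2", "b9"]

covered, uncovered = split_wants_with_hint(wants, "c1", bitmap_index)
print(covered, uncovered)

# Without a usable hint, every want falls through to the slow path.
no_hint = split_wants_with_hint(wants, None, bitmap_index)
print(no_hint)
```

In this model the hint lets the server answer for b1 and b2 from the
bitmap alone, leaving only b9 for the expensive check.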

[1] https://lore.kernel.org/git/20201215200207.1083655-1-jonathantanmy@xxxxxxxxxx/



