Re: [External Mail]Re: Partial-clone cause big performance impact on server

On Wed, Aug 17, 2022 at 09:41:10AM -0400, Derrick Stolee wrote:

> On 8/17/2022 6:22 AM, 程洋 wrote:
> > But I still think the protocol should tell the server from which ref
> > the blob is reachable, because otherwise it would be really hard to
> > implement any kind of ACL.
> 
> I think this idea has merit on its face, but it wouldn't really solve the
> problem since the reachability query would still need to be done, just
> from a smaller set of references at first. If we were able to say "this
> blob can be found at path X at commit Y" then the server could do a
> commit-reachability query and a path traversal, which should be a lot
> faster.
> 
> However, it would be extremely difficult to plumb into the partial clone
> machinery. At the point where Git realizes it is missing a promisor
> object, that code is very generic and removed from any kind of walk from a
> reference. That is further complicated by the fact that the walk is
> probably from a local reference, which can be entirely different from the
> remote reference.

Agreed. The client often doesn't know the context of what it's asking
for. Sometimes that context simply isn't carried through the code, but
we also have commands that might not be invoked with a commit at all!
It's valid to run "git read-tree <tree>", and we should be able to
fault in blobs from that tree as needed.

I also think that this kind of "is the blob reachable" query is
mostly expensive if you don't have reachability bitmaps at all. If you
do, then the cost to ask "is this object reachable" is the same for a
commit or a blob. If the server has a bitmap of all objects reachable
for each branch ACL (even if it has to do some small bit of fill-in
walking to bring it up to date), then querying for any object type the
client asks for is still just a bit lookup.
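
To make the "just a bit lookup" point concrete, here is a minimal sketch
in Python. It is a toy model, not how Git or JGit store bitmaps (real
bitmaps are compressed and indexed by pack order). The object names, the
ref names, and the helpers make_bitmap() and is_reachable() are all
hypothetical.

```python
# Toy model of per-branch reachability bitmaps. Every name is illustrative.
# Each object gets a bit position; a bitmap is a Python int used as a bit set.
objects = ["commit_a", "tree_a", "blob_readme",
           "commit_b", "tree_b", "blob_secret"]
index = {oid: pos for pos, oid in enumerate(objects)}


def make_bitmap(reachable):
    """Build a bitmap with one bit set per reachable object."""
    bits = 0
    for oid in reachable:
        bits |= 1 << index[oid]
    return bits


# One precomputed bitmap per branch (assumed to be kept up to date).
branch_bitmaps = {
    "refs/heads/public": make_bitmap(["commit_a", "tree_a", "blob_readme"]),
    "refs/heads/private": make_bitmap(objects),
}


def is_reachable(oid, allowed_refs):
    """True if oid is reachable from any ref the client may read."""
    combined = 0
    for ref in allowed_refs:
        combined |= branch_bitmaps[ref]
    # The same bit test works for commits, trees, and blobs alike.
    return bool((combined >> index[oid]) & 1)
```

In this model the check for blob_secret against refs/heads/public is the
same operation as the check for commit_a: OR together the bitmaps of the
permitted refs and test one bit.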

Not knowing a lot about Gerrit or JGit, it's not clear to me if there
are configuration knobs that could be tweaked on the server side to make
these requests more efficient.

> One possible hurdle is the fact that this branch-level security is a
> feature of Gerrit, not a feature of Git itself. Optimizing Git to that
> special case that Git does not itself support is less valuable to the Git
> project itself.

Git doesn't have branch-level security per se, but I do think that
everything is there in Git to do fast "is this object reachable from
these branches" queries. If you're making a lot of those queries it
might influence your decision of which bitmaps to generate, but the
bitmap concept itself should be sufficient.
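
As a rough illustration of why the choice of bitmapped commits matters,
here is a small Python sketch of a fill-in walk. It is a simplification
under stated assumptions: the commit graph, the "introduced" table (the
objects each commit adds), and reachable_from() are hypothetical, and
bitmaps are modeled as plain sets rather than compressed bit arrays.

```python
# Toy commit graph; all names are hypothetical.
parents = {"c4": ["c3"], "c3": ["c2"], "c2": ["c1"], "c1": []}

# Objects each commit introduces (itself plus any new blobs). A real
# walk would read trees; this table stands in for that work.
introduced = {
    "c1": {"c1", "blob_1"},
    "c2": {"c2", "blob_2"},
    "c3": {"c3", "blob_3"},
    "c4": {"c4", "blob_4"},
}

# Stored bitmaps, modeled as sets. Only c2 has one in this example.
stored = {"c2": {"c1", "blob_1", "c2", "blob_2"}}


def reachable_from(tip):
    """Return (reachable objects, commits walked during fill-in)."""
    result = set()
    seen = set()
    walked = 0
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        if commit in stored:
            # A stored bitmap covers everything below this commit.
            result |= stored[commit]
            continue
        walked += 1  # no bitmap here, so fill in by walking
        result |= introduced[commit]
        stack.extend(parents[commit])
    return result, walked
```

With a bitmap only at c2, a query from c4 walks two commits before it
reaches stored data. Storing a bitmap at the branch tip as well would
reduce that walk to zero, which is the kind of trade-off that deciding
which bitmaps to generate comes down to in this model.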

-Peff


