Re: [PATCH] [RFC] list-objects-filter: introduce new filter sparse:buffer=<spec>

On Thu, Aug 11, 2022 at 05:15, Jeff King <peff@xxxxxxxx> wrote:
>
> On Tue, Aug 09, 2022 at 09:37:09AM -0400, Derrick Stolee wrote:
>
> > > Was the reason why we have "we limit to an object we already have"
> > > restriction because we didn't want to blindly use a piece of
> > > uncontrolled arbitrary end-user data here?  Just wondering.
> >
> > One of the ideas here was to limit the opportunity of sending an
> > arbitrary set of data over the Git protocol and avoid exactly the
> > scenario you mention.
>
> One other implication here is that the filter spec is sent inside of a
> pkt-line.  So the implementation here is limiting us to 64kb. That may
> sound like a lot for simple specs, but I imagine in big repos they can
> possibly get pretty complex.
>
> That would be fixable with a protocol extension to take the data over
> multiple pkt-lines.
>

A filter rules file of more than 64KB sounds very scary...
And if the filter really is that big, I suspect the server will also be
quite slow to parse it.
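For reference, the 64KB ceiling comes from pkt-line framing: each pkt-line
starts with a 4-hex-digit length that includes the header itself, so one
pkt-line can carry at most 65516 bytes of payload. A minimal sketch (the
constant and function names below are mine, not Git's):

```python
# Sketch of Git's pkt-line framing, per Documentation/gitprotocol-common.txt:
# a 4-hex-digit length prefix that counts itself caps the payload size.
MAX_PKT_LEN = 65520            # largest legal pkt-line, header included
MAX_PAYLOAD = MAX_PKT_LEN - 4  # 65516 bytes of actual data

def encode_pkt_line(payload: bytes) -> bytes:
    """Frame one payload as a single pkt-line."""
    if len(payload) > MAX_PAYLOAD:
        # This is the limit Peff mentions: a bigger filter spec would
        # need a protocol extension spanning multiple pkt-lines.
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % (len(payload) + 4) + payload

# A short filter spec fits easily; a >64KB sparse spec would raise.
print(encode_pkt_line(b"filter blob:none"))
```

So a sparse spec sent this way is hard-limited to one pkt-line's worth of
data unless the protocol learns to split it.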

> That said...
>
> > At this moment, I think path-scoped filters have a lot of problems
> > that need solving before they can be used effectively in the wild.
> > I would prefer that we solve those problems before making the
> > feature more complicated. That's a tall ask, since these problems
> > do not have simple solutions.
>
> ...I agree with this. It is nice to put more power in the hands of the
> clients, but we have to balance that with other issues like server
> resource use. The approach so far has been to implement the simplest and
> most efficient operations at the client-server level, and then have the
> client build local features on top of that. So in this case, probably
> requesting that _no_ trees are sent in the initial clone, and then
> faulting them in as the client explores the tree using its own local
> sparse definition. And I think that mostly works now.
>

Agreed. But after a partial clone we have to fetch these blobs one by
one, so why not cut that extra network overhead by fetching the blobs
that are *most* needed in the initial partial clone itself?
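The workflow Peff describes (omit objects at clone time, fault them in as
the client needs them) can be tried end to end against a throwaway local
repository; the paths below are hypothetical and `--no-local` forces the
real transport path so the filter is honored:

```shell
#!/bin/sh
set -e
tmp=$(mktemp -d)

# Build a throwaway "server" repository with one file.
git init -q "$tmp/server"
(
  cd "$tmp/server"
  echo hello >file.txt
  git add file.txt
  git -c user.email=a@b -c user.name=a commit -qm init
)

# --filter=blob:none omits all blobs up front; the checkout at the end
# of the clone faults in only the blobs HEAD actually needs.
git clone -q --no-local --filter=blob:none "$tmp/server" "$tmp/client"

# Reading a file goes through the promisor remote on demand.
git -C "$tmp/client" cat-file -p HEAD:file.txt
```

The same shape works with `--filter=tree:0` for a treeless clone; the
trade-off is exactly the one above: fewer bytes up front, more round
trips as the client explores the tree.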

> Though I admit I do not keep a close watch on the status of
> partial-checkout features. I mostly always cared about it from the
> server provider angle. ;)
>
> -Peff

Thanks.

ZheNing Hu



