Derrick Stolee <derrickstolee@xxxxxxxxxx> wrote on Tue, Aug 9, 2022 at 21:37:
>
> On 8/8/2022 12:15 PM, Junio C Hamano wrote:
> > "ZheNing Hu via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:
> >
> >> From: ZheNing Hu <adlternative@xxxxxxxxx>
> >>
> >> Although we already had a `--filter=sparse:oid=<oid>` which
> >> can be used to clone a repository with limited objects which meet
> >> filter rules in the file corresponding to the <oid> on the git
> >> server. But it can only read filter rules which have been recorded
> >> in the git server before.
> >
> > Was the reason why we have "we limit to an object we already have"
> > restriction because we didn't want to blindly use a piece of
> > uncontrolled arbitrary end-user data here?  Just wondering.
>
> One of the ideas here was to limit the opportunity of sending an
> arbitrary set of data over the Git protocol and avoid exactly the
> scenario you mention.
>
> Another was that it is incredibly expensive to compute the set of
> reachable objects within an arbitrary sparse-checkout definition,
> since it requires walking trees (bitmaps do not help here). This
> is why (to my knowledge) no Git hosting service currently supports
> this mechanism at scale. At minimum, using the stored OID would
> allow the host to keep track of these pre-defined sets and do some
> precomputing of reachable data using bitmaps to keep clones and
> fetches reasonable at all.
>

How about allowing only some simpler filter rules? For example, with
https://github.com/derrickstolee/sparse-checkout-example:

User A can use --filter="sparse:buffer=client" to download only the
client/ directory.

User B can use --filter="sparse:buffer=service/list" to download only
service/list.

cat >filterspec <<-EOF
web
service
EOF

User C can use --filter="sparse:buffer=`cat filterspec`" to download
web/ and service/.

cat >filterspec <<-EOF
service
!service/list
EOF

But user D cannot use --filter="sparse:buffer=`cat filterspec`" to
download service/ without service/list, because negated rules would
not be allowed.

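To make "simpler filter rules" a bit more concrete, here is a rough sketch of the kind of check I have in mind: accept a filterspec only if every rule is a plain directory prefix, and reject anything with negation or glob characters. The helper name is_simple_filterspec is hypothetical, nothing like it exists in Git today, and the exact set of rejected characters is only my assumption.

```shell
# Hypothetical helper (not part of Git): read a filterspec on stdin and
# succeed only if every rule is a plain directory prefix.
is_simple_filterspec () {
	# The "|| test -n" part also handles a last line without a newline.
	while IFS= read -r rule || test -n "$rule"
	do
		case "$rule" in
		"" | "!"* | *"*"* | *"?"* | *"["*)
			# Empty rule, negation, or glob pattern: not "simple".
			return 1 ;;
		esac
	done
	return 0
}
```

With this check, user C's filterspec (web, service) would be accepted and user D's (service, !service/list) would be rejected.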
I guess many users could benefit from this...

> The other side of the issue is that we do not have a good solution
> for resolving how to change this filter in the future, in case the
> user wants to expand their sparse-checkout definition and update
> their partial clone filter.
>

I guess we don't really need to maintain this "partial clone filter"
separately. We could even reuse the sparse-checkout rules after the
first partial clone. Maybe we should write the first partial-clone
filter rules to .git/info/sparse-checkout (only when --sparse is used
in git clone?).

> There used to be a significant issue where a 'git checkout'
> would fault in a lot of missing trees because the index needed to
> reference the files outside of the sparse-checkout definition. Now
> that the sparse index exists, this is less of an impediment, but
> it can still cause some pain.
>

Agreed.

> At this moment, I think path-scoped filters have a lot of problems
> that need solving before they can be used effectively in the wild.
> I would prefer that we solve those problems before making the
> feature more complicated. That's a tall ask, since these problems
> do not have simple solutions.
>

Could you tell me what those problems are? I can start working on
them :)

> Thanks,
> -Stolee

Thanks.

ZheNing Hu