On Tue, Oct 24, 2017 at 11:53 AM, Jeff Hostetler <git@xxxxxxxxxxxxxxxxx> wrote: > From: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> > > I've been working with Jonathan Tan to combine our partial clone > proposals. This patch series represents a first step in that effort > and introduces an object filtering mechanism to select unwanted > objects. > > [1] traverse_commit_list and list-objects is extended to allow > various filters. > [2] rev-list is extended to expose filtering. This allows testing > of the filtering options. And can be used later to predict > missing objects before commands like checkout or merge. > [3] pack-objects is extended to use filtering parameters and build > packfiles that omit unwanted objects. > > This patch series lays the ground work for subsequent parts which > will extend clone, fetch, fetch-pack, upload-pack, fsck, and etc. Thanks - I've taken a look at all of them except for the partialclone extension one, which I've only glanced over briefly. Apart from some style issues (indentation and typedef-ing of enums) I think that they generally look all right. One possible issue with using a formatted filter string (e.g. blob:none) directly passed to the server as opposed to actual client-interpreted flags (e.g. --blob-byte-limit=0) is that a user may be confused if the version of Git they're using supports fancier filters, which will work if they're running rev-list locally, but not when they're fetching from a not-so-fancy Git server. Maybe that is fine, though - something we've discussed internally is an ability to offload some calculations (e.g. git log -S) to Git servers, and if we end up doing something similar to that, users will need to deal with this problem anyway. The reason why I only glanced over the partialclone patch is because I think that that change needs more discussion than the rest, and it would be good to get the others in first. I know that you included the partialclone patch because it is needed by the rev-list one, but as I commented in the rev-list one, I think that it does not need to be aware of partial clones, at least for now. The overall ideas in the partialclone patch seem good, though - in particular, one conceptual difference from my patch [1] is that the filter setting is a property of the repository as opposed to the remote, which does seem like an improvement, because it makes it easier to make and explain e.g. a "git log --use-repo-filter -S" command that uses the preconfigured config. [1] https://public-inbox.org/git/407a298b52a9e0a2ee4135fe844e35b9a14c0f7b.1506714999.git.jonathantanmy@xxxxxxxxxx/