Junio C Hamano <gitster@xxxxxxxxx> writes:

> Christian Couder <christian.couder@xxxxxxxxx> writes:
>
>> In some discussions, it was mentioned that such a feature, or a
>> similar feature in `git gc`, or in a new standalone command (perhaps
>> called `git prune-filtered`), should put the filtered out objects into
>> a new packfile instead of deleting them.
>>
>> Recently there were internal discussions at GitLab about either moving
>> blobs from inactive repos onto cheaper storage, or moving large blobs
>> onto cheaper storage. This led us to rethink repacking using a
>> filter, but moving the filtered out objects into a separate packfile
>> instead of deleting them.
>>
>> So here is a new patch series doing that while implementing the
>> `--filter=<filter-spec>` option in `git repack`.
>
> Very interesting idea, indeed, and would be very useful.
>
> Thanks.

Overall, I have a split feeling on the series.

One side of my brain thinks that the series does a very good job of
addressing the needs of those who want to partition their objects
into two classes, and the problem I saw in the series was mostly the
way it was sold (in other words, if it did not mention unbloating
lazily cloned repositories at all, I would have said "Yes! It is an
excellent series.", and if it said "this mechanism is not meant to be
used to unbloat a lazily cloned repository, because the mechanism
does not distinguish objects that are only locally available and
objects that are retrievable from the promisor remotes, among those
that match the filter", it would have been even better).

To the other side of my brain, it smells as if the series wanted to
address the unbloating issue, but ended up with an unsatisfactory
solution, and used "partitioning objects in a full repository on the
server side" as an excuse for the resulting mechanism to still exist,
even though it is not usable for the original purpose.

Ideally, it would be great to have a mechanism that can be used for
both. The "partitioning" can be treated as a degenerate case where
the repository does not have its upstream promisor (hence, any object
that matches the filtering criteria can be excluded from the primary
pack because there are no "not available (yet) in our promisor"
objects), while the "unbloat" case can know who its promisors are and
ask the promisors what objects, among those that match the filtering
criteria, are still available from them to exclude only those objects
from the primary pack.

In the second-best world, we may not be ready to tackle the
unbloating issue, but "partitioning" alone may still be a useful
feature. In that case, perhaps the series can be salvaged by
updating how the feature is sold, with some comments indicating the
future direction to extend the mechanism later.

Thanks.
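
To make the "one mechanism for both" idea above a bit more concrete,
here is a rough sketch of the object selection, written in Python
purely as pseudocode. Nothing here is an existing Git API: the
function `split_objects`, the `matches_filter` predicate, and the
`available_from_promisors` callback are all hypothetical names made
up for illustration, and how a promisor would actually be queried is
left open.

```python
from typing import Callable, Iterable, Optional, Set, Tuple


def split_objects(
    all_objects: Iterable[str],
    matches_filter: Callable[[str], bool],
    available_from_promisors: Optional[Callable[[Set[str]], Set[str]]] = None,
) -> Tuple[Set[str], Set[str]]:
    """Return (primary, excluded) sets of object names.

    Hypothetical sketch: `available_from_promisors`, when given, takes
    the candidate objects and returns the subset the promisor remotes
    can still supply. When it is None, the repository has no promisor
    (the "partitioning" case).
    """
    objects = set(all_objects)

    # Objects that the filter spec would like to move out.
    candidates = {oid for oid in objects if matches_filter(oid)}

    if available_from_promisors is None:
        # Degenerate case: no upstream promisor, so every matching
        # object may leave the primary pack.
        excludable = candidates
    else:
        # "Unbloat" case: only drop what the promisors can give back;
        # locally-only objects stay in the primary pack.
        excludable = candidates & available_from_promisors(candidates)

    primary = objects - excludable
    return primary, excludable


# Toy data: object name -> size in bytes (illustrative only).
sizes = {"a": 10, "b": 5000, "c": 9000}


def is_large(oid: str) -> bool:
    return sizes[oid] > 1000


# Server-side partitioning: no promisor to consult.
part_primary, part_excluded = split_objects(sizes, is_large)

# Lazily cloned repository whose promisor only still has "b".
lazy_primary, lazy_excluded = split_objects(
    sizes, is_large, lambda wanted: wanted & {"b"}
)
```

In the toy data, the large object "c" exists only locally in the
second call, so it stays in the primary set, which is exactly the
distinction the filter alone cannot make.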