Re: Is --filter-print-omitted correct/used/needed?

Emily Shaffer <emilyshaffer@xxxxxxxxxx> · Fri, 7 Jun 2019 14:10:01 -0700

On Fri, Jun 07, 2019 at 02:57:01PM -0400, Jeff Hostetler wrote:
> 
> 
> On 6/7/2019 2:38 AM, Christian Couder wrote:
> > On Thu, Jun 6, 2019 at 10:18 PM Emily Shaffer <emilyshaffer@xxxxxxxxxx> wrote:
> > 
> > 
> > 
> > > I grepped the Git source and found that we only provide a non-NULL
> > > "omitted" when someone calls "git rev-list --filter-print-omitted",
> > > which we verify with a simple test case for "blob:none", in which
> > > case the "border" objects which were omitted must be the same as all
> > > objects which were omitted (since blobs aren't pointing to anything
> > > else). I think if we had written a similar test case with some trees
> > > we expect to omit we might have noticed sooner.
> > 
> > It seems that --filter-print-omitted was introduced in caf3827e2f
> > (rev-list: add list-objects filtering support, 2017-11-21) so I cc'ed
> > Jeff.
> > 
> > [...]
> 
> The --filter-print-omitted option was intended to print the complete
> list of omitted objects.  For example, a packfile built from a filtered
> command and a packfile built from the unfiltered command would differ
> by exactly that set of objects.
> 
> So the discrepancy reported by the tree:1 example is incorrect.
> The omitted set is the full set, not the frontier.  So when
> --filter-print-omitted is used, we still have to do the full tree walk.
> When not specified, we do get the perf boost because we can terminate
> the tree walk early.
> 
> 
> > > So, what do we use --filter-print-omitted for? Is anybody needing it?
> > > Or do we just use it to verify this one test case? Should we fix it,
> > > or get rid of it, or neither?
> > 
> > In caf3827e2f there is:
> > 
> >      This patch introduces handling of missing objects to help
> >      debugging and development of the "partial clone" mechanism,
> >      and once the mechanism is implemented, for a power user to
> >      perform operations that are missing-object aware without
> >      incurring the cost of checking if a missing link is expected.
> > 
> > So I would say that if you think that --filter-print-omitted doesn't
> > help in debugging or development, and can even be confusing, and that
> > it also doesn't help performance for power users or anyone else, then
> > it would make sense to remove it, unless you find a way to make it
> > fulfill its original goals, or maybe other worthwhile goals.
> 
> I don't currently have a use for that (other than the existing test
> cases), but we could use that in the future as a guide for the server
> to put the omitted objects on a CDN, for example.
> 
> So I'd say let's leave it as is for now.

Thanks for the input, Jeff. I wasn't sure from looking at it whether it
was intended behavior with a plan for use; looks like it is. I'll leave
it alone.
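
For anyone following along, here is a rough sketch of the distinction
Jeff draws between the full omitted set and the frontier. It is a toy
model only, not Git's implementation: the object names, the CHILDREN
table and the filtered_walk() helper are all made up for illustration,
the graph is a plain tree (no shared objects), and the depth rule is a
simplified stand-in for a "tree:<depth>" style filter.

```python
# Toy object graph. All names are hypothetical, not real Git object IDs.
# Trees list their entries; blobs have no children.
CHILDREN = {
    "root": ["README", "src"],
    "src": ["main.c", "lib"],
    "lib": ["util.c"],
    "README": [],
    "main.c": [],
    "util.c": [],
}


def filtered_walk(root, depth_limit, full_omitted):
    """Walk the toy graph, omitting every object at depth >= depth_limit.

    full_omitted=False models the fast path: stop descending as soon as
    an object is omitted, so only the frontier is ever seen.
    full_omitted=True models what Jeff describes for
    --filter-print-omitted: keep walking so that every omitted object
    is recorded.
    """
    shown, omitted = [], []
    stack = [(root, 0)]
    while stack:
        name, depth = stack.pop()
        if depth < depth_limit:
            shown.append(name)
        else:
            omitted.append(name)
            if not full_omitted:
                continue  # early termination: nothing below is visited
        # Push children in reverse so they are visited in listed order.
        for child in reversed(CHILDREN[name]):
            stack.append((child, depth + 1))
    return shown, omitted


if __name__ == "__main__":
    _, frontier = filtered_walk("root", 1, full_omitted=False)
    _, everything = filtered_walk("root", 1, full_omitted=True)
    print("frontier only:", frontier)
    print("full omitted set:", everything)
```

In this model the full omitted set is exactly the difference between
an unfiltered walk and the filtered one, which is the packfile
property described above. The frontier-only walk visits fewer objects,
which is where the performance gain comes from when the option is not
given.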

> 
> 
> Jeff
> 
> 
>