Hi, Jeff Hostetler wrote: > Teach rev-list to use the filtering provided by the > traverse_commit_list_filtered() interface to omit > unwanted objects from the result. > > Object filtering is only allowed when one of the "--objects*" > options are used. micronit: the line widths seem to be uneven in these commit messages, which is a bit distracting when reading. > When the "--filter-print-omitted" option is used, the omitted > objects are printed at the end. These are marked with a "~". > This option can be combined with "--quiet" to get a list of > just the omitted objects. Neat. Can you give a quick example? Using --quiet for this feels a bit odd, since it previously meant to print nothing to stdout. I wonder if there's another way --- e.g. --print-omitted=(yes|no|only) If I wanted to list all objects matching a filter, even objects that are not reachable from any ref, is there a way to do that? (Just curious, trying to think about this interface.) > Add t6112 test. This part doesn't need to be in the commit message. More generally, anything I could more easily learn from the code or diffstat doesn't need to be in the commit message: the commit message is about the "why" more than the details of what in the code changed. > In the future, we will introduce a "partial clone" mechanism > wherein an object in a repo, obtained from a remote, may > reference a missing object that can be dynamically fetched from > that remote once needed. This "partial clone" mechanism will > have a way, sometimes slow, of determining if a missing link > is one of the links expected to be produced by this mechanism. Does this mean the <filter-spec>s will be part of the wire protocol? I'll look more carefully at them below with that in mind. > This patch introduces handling of missing objects to help > debugging and development of the "partial clone" mechanism, > and once the mechanism is implemented, for a power user to > perform operations that are missing-object aware without > incurring the cost of checking if a missing link is expected. I had trouble understanding what this paragraph is about. Can you give an example? > Signed-off-by: Jeff Hostetler <jeffhost@xxxxxxxxxxxxx> > --- > Documentation/git-rev-list.txt | 4 +- > Documentation/rev-list-options.txt | 36 ++++++ > builtin/rev-list.c | 108 ++++++++++++++++- > t/t6112-rev-list-filters-objects.sh | 225 ++++++++++++++++++++++++++++++++++++ > 4 files changed, 370 insertions(+), 3 deletions(-) > create mode 100755 t/t6112-rev-list-filters-objects.sh Looks reasonably concise, good. [...] > --- a/Documentation/git-rev-list.txt > +++ b/Documentation/git-rev-list.txt > @@ -47,7 +47,9 @@ SYNOPSIS > [ --fixed-strings | -F ] > [ --date=<format>] > [ [ --objects | --objects-edge | --objects-edge-aggressive ] > - [ --unpacked ] ] > + [ --unpacked ] > + [ --filter=<filter-spec> [ --filter-print-omitted ] ] ] Does this mean --filter is only useful with --objects? E.g. I can't use it to filter commits? > + [ --missing=<missing-action> ] --missing=(error|allow-any|print) would be more informative and about equally concise. Since this is mainly for debugging, does it have a different compatibility guarantee from other options? Could it be named accordingly to set expectations? [...] > +The form '--filter=blob:none' omits all blobs. Sounds sensible. > ++ > +The form '--filter=blob:limit=<n>[kmg]' omits blobs larger than n bytes > +or units. The value may be zero. On second thought, doesn't blob:limit=0 mean blob:none is not needed? Is it for future consistency with tree:none? What units do [kmg] use? Are they GB, GiB, or one of the variants in between? > ++ > +The form '--filter=sparse:oid=<oid-ish>' uses a sparse-checkout > +specification contained in the object (or the object that the expression > +evaluates to) to omit blobs that would not be not required for a > +sparse checkout on the requested refs. This one makes me a little nervous because it would mean we're planning on adding sparse-checkout specifications to the wire protocol. Maybe that's okay --- they're already part of the on-disk format --- but it makes me nervous because the sparse-checkout format is not so great, as I believe MS has already noticed. What is an <oid-ish>? Can it just say <blob>? How would this one work when passed over the wire? > ++ > +The form '--filter=sparse:path=<path>' similarly uses a sparse-checkout > +specification contained in <path>. Is this <path> relative to the cwd of the caller, or is it within some commit? Sorry it took so long to send this feedback / these questions. Hopefully it's useful nevertheless. Thanks and hope that helps, Jonathan