Re: [RFC PATCH 3/3] list-objects-filter: implement composite filters

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Matthew DeVore <matvore@xxxxxxxxxx> writes:

> Allow combining filters such that only objects accepted by all filters
> are shown. The motivation for this is to allow getting directory
> listings without also fetching blobs. This can be done by combining
> blob:none with tree:<depth>. There are massive repositories that have
> larger-than-expected trees - even if you include only a single commit.
>
> The current usage requires passing the filter to rev-list, or sending
> it over the wire, as:
>
> 	combine:<FILTER1>+<FILTER2>
>
> (i.e.: git rev-list --filter=combine:tree:2+blob:limit=32k). This is
> potentially awkward because individual filters must be URL-encoded if
> they contain + or %. This can potentially be improved by supporting a
> repeated flag syntax, e.g.:
>
> 	$ git rev-list --filter=tree:2 --filter:blob:limit=32k

Shouldn't the second one say "--filter=blob:limit=32k" (i.e. the
first colon should be an equal sign)?

> Such usage is currently an error, so giving it a meaning is backwards-
> compatible.

Two minor comments.  

If combine means "must satisfy all of these", '+' is probably a poor
choice (perhaps we want '&' instead).  Also, it seems to me that
having to worry about url encoding and parsing encoded data
correctly and securely would be far more work than simply taking
multiple command line parameters, accumulating them in a string
list, and then at the end of command line parsing, building a
combined filter out of all of them at once (a degenerate case may
end up attempting to build a combined filter that combines a single
filter), iow just biting the bullet and do the "potentially be
improved" step from the beginning.

> +The form '--filter=combine:<filter1>+<filter2>+...<filterN>' combines
> +several filters. Only objects which are accepted by every filter are
> +included. Filters are joined by "+" and individual filters are %-encoded
> +(i.e. URL-encoded). Only the "%" and "+" characters must be encoded. For
> +instance, 'combine:tree:3+blob:none' and
> +'combine:tree%3A2+blob%3Anone' are equivalent, and result in only tree

I do not quite get this part.  The way I read the justification for
encoding given in the proposed commit log message was because a
filter name or filter parameter might want to include the '+' sign,
so a filter whose name is frotz+nitfol that takes a param may be
invoked standalone as "--filter=frotz+nitfol:param=1" needs to
protect its '+' from getting mistaken when used in the combine
filter, it should not be mistaken as a combination of 'frotz' filter
(with no parameter) and 'nitfol' filter (with param set to one), and
the way to do so is to say

	--filter="combine:frotz%2Bnitfol:param=1"

(this is a degenerate 'combination of a single filter', but you can
add another one by following it with '+' and another URLencoded
string).

So why are we allowing %3A there that does not even have to be
encoded?  Shouldn't it be an error?

In any case, I am not quite convinced that we need to complicate the
parameters with URLencoding, so I'd skip reviewing large part this
patch that is about "decoding".

Once the combined filter definition is built in-core, the code that
evaluates the intersection of all conditions seems to be written
sanely to me.

Thanks.



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux