> On May 16, 2019, at 8:25 PM, Junio C Hamano <gitster@xxxxxxxxx> wrote:
>>
>> $ git rev-list --filter=tree:2 --filter:blob:limit=32k
>
> Shouldn't the second one say "--filter=blob:limit=32k" (i.e. the
> first colon should be an equal sign)?

That's right. Fixed locally.

>
>> Such usage is currently an error, so giving it a meaning is
>> backwards-compatible.
>
> Two minor comments.
>
> If combine means "must satisfy all of these", '+' is probably a poor
> choice (perhaps we want '&' instead). Also, it seems to me that

I think I agree. & is more intuitive.

> having to worry about url encoding and parsing encoded data
> correctly and securely would be far more work than simply taking
> multiple command line parameters, accumulating them in a string
> list, and then at the end of command line parsing, building a
> combined filter out of all of them at once (a degenerate case may
> end up attempting to build a combined filter that combines a single
> filter), iow just biting the bullet and do the "potentially be
> improved" step from the beginning.

My intention actually is to support the repeated flag pretty soon, but
I only want to write the code if there's agreement on my current
approach. My justification for the URL-encoding scheme is:

1. The combined filters will eventually have to travel over the wire.

2. The Git protocol will either have repeated "filter" lines or it
   will continue to use a single filter line with an encoding scheme.

3. Continuing to use a single filter line seemed the least disruptive
   considering both this codebase and Git clones like JGit. Other
   clones will likely fail saying "unknown filter combine:" or
   something like that until it gets implemented. A paranoid
   consideration is that clones and proprietary server implementations
   may currently allow the "filter" line to be silently overridden if
   it is repeated.

4. Assuming we *do* use a single filter line over the wire, it makes
   sense to allow the user to specify the raw filter line as well as
   have the more friendly UI of repeating --filter flags.

5. If we use repeated "filter" lines over the wire, and later start
   implementing a more complete DSL for specifying filters (see
   Mercurial's "revsets"), the repeated-filter-line feature in the
   protocol may end up becoming deprecated and we will end up
   back-pedaling to allow integration of the "&" operator with
   whatever new operators we need. (I very much doubt I will be the
   one implementing such a DSL for filters or revsets, but I think
   it's a possibility.)

> So why are we allowing %3A there that does not even have to be
> encoded? Shouldn't it be an error?

We do have to require that the combine operator (& or +) and % be
encoded. For other characters, there are four options:

1. Allow anything to be encoded. I chose this because it's how I
   usually think of URL encoding working. For instance, if I go to
   https://public-inbox.org/git/?q=cod%65+coverage in Chrome, the
   browser automatically decodes the %65 to an e in the address bar.
   Safari does not automatically decode, but the server apparently
   interprets the %65 as an e. I am not really attached to this
   choice.

2. Do not allow or require anything else to be encoded.

3. Require encoding of a couple of "reserved" characters that don't
   appear in filters now, and don't typically appear in UNIX path
   names. This would allow for expansion later. For instance,
   "~&%*+|(){}!\" plus the ASCII range [0, 0x20] and single and
   double quotes. Do not allow encoding of anything else.

4. Same requirements as 3, but permit encoding of other arbitrary
   characters.

I kind of like 3 now that I've thought it out more.

>
> In any case, I am not quite convinced that we need to complicate the
> parameters with URLencoding, so I'd skip reviewing large part this
> patch that is about "decoding".

It's fine if we drop the encoding scheme. I intentionally tried to
limit the amount of work I stacked on top of it until I got agreement.
Please let me know if anything I've said changes your perspective.

>
> Once the combined filter definition is built in-core, the code that
> evaluates the intersection of all conditions seems to be written
> sanely to me.

Great! I actually did simplify it a bit since I sent the first
roll-up.

Thanks.
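
To make option 3 and the repeated-flag idea concrete, here is a rough
sketch in Python (not the C from the patch). It assumes "&" as the
combine operator, a "combine:" prefix on the single filter line, and
exactly the reserved set listed under option 3. All function names are
hypothetical and only meant to illustrate the rules, not to describe
the actual implementation.

```python
import string

# Assumption: the reserved set from option 3, i.e. ~&%*+|(){}!\ plus
# single and double quotes and the ASCII range [0, 0x20].
RESERVED = set("~&%*+|(){}!\\'\"") | {chr(c) for c in range(0x21)}


def encode_subfilter(spec):
    """Percent-encode only the reserved characters of one sub-filter."""
    return "".join(
        "%{:02X}".format(ord(ch)) if ch in RESERVED else ch for ch in spec
    )


def decode_subfilter(encoded):
    """Decode one sub-filter under option 3.

    Raw reserved characters are rejected, and so are escapes of
    characters that did not need encoding (e.g. %3A for ':').
    """
    out = []
    i = 0
    while i < len(encoded):
        ch = encoded[i]
        if ch == "%":
            digits = encoded[i + 1:i + 3]
            if len(digits) != 2 or any(d not in string.hexdigits for d in digits):
                raise ValueError("malformed escape at offset %d" % i)
            decoded = chr(int(digits, 16))
            if decoded not in RESERVED:
                raise ValueError("needless escape %%%s" % digits)
            out.append(decoded)
            i += 3
        elif ch in RESERVED:
            raise ValueError("unencoded reserved character %r" % ch)
        else:
            out.append(ch)
            i += 1
    return "".join(out)


def combine_filters(specs):
    """Build a single filter line from repeated --filter values."""
    return "combine:" + "&".join(encode_subfilter(s) for s in specs)


def split_combined(line):
    """Split a combined filter line back into decoded sub-filters."""
    prefix = "combine:"
    if not line.startswith(prefix):
        raise ValueError("not a combined filter")
    return [decode_subfilter(part) for part in line[len(prefix):].split("&")]
```

With these rules, the two filters from the example at the top would
travel as the single line "combine:tree:2&blob:limit=32k", while
"tree%3A2" would be rejected because ':' is not in the reserved set.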