Jonathan Tan <jonathantanmy@xxxxxxxxxx> writes:

> Refactor, into a common function, the version and capability negotiation
> done when invoking a long-running process as a clean or smudge filter.
> This will be useful for other Git code that needs to interact similarly
> with a long-running process.
>
> As you can see in the change to t0021, this commit changes the error
> message reported when the long-running process does not introduce itself
> with the expected "server"-terminated line. Originally, the error
> message reports that the filter "does not support filter protocol
> version 2", differentiating between the old single-file filter protocol
> and the new multi-file filter protocol - I have updated it to something
> more generic and useful.
>
> Signed-off-by: Jonathan Tan <jonathantanmy@xxxxxxxxxx>

Overall I like the direction, even though the abstraction that
results from this code seems to me a bit too tightly defined; in
other words, I cannot be sure that this will be useful enough in a
more general context, or that it will not make some potential
applications feel a bit too constrained.

> +	static int versions[] = {2, 0};
> +	static struct subprocess_capability capabilities[] = {
> +		{"clean", CAP_CLEAN}, {"smudge", CAP_SMUDGE}, {NULL, 0}
> +	};
>  	struct cmd2process *entry = (struct cmd2process *)subprocess;
> ...
> +	return subprocess_handshake(subprocess, "git-filter-", versions, NULL,
> +				    capabilities,
> +				    &entry->supported_capabilities);
>  }

I would have defined the welcome prefix to lack the final dash,
i.e. forcing the hardcoded suffixes for clients and servers in any
protocol that uses this API to end with "-client" and "-server",
i.e. with the dash.
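
A minimal sketch of the prefix convention suggested above, assuming
hypothetical helpers (format_client_greeting() and is_server_greeting()
are illustration only and are not part of the patch): the caller passes
the bare name, e.g. "git-filter", and the API supplies the dash together
with the suffix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: write "<name>-client" into buf.
 * Returns 0 on success, -1 if the result does not fit.
 */
static int format_client_greeting(char *buf, size_t size, const char *name)
{
	int n;

	assert(buf && name);
	n = snprintf(buf, size, "%s-client", name);
	return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/*
 * Hypothetical helper: return 1 if line is exactly "<name>-server",
 * 0 otherwise.
 */
static int is_server_greeting(const char *line, const char *name)
{
	size_t len = strlen(name);

	return !strncmp(line, name, len) && !strcmp(line + len, "-server");
}
```

With this shape a caller cannot forget (or double) the dash, because the
separator lives in one place instead of in every welcome prefix.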

> diff --git a/sub-process.c b/sub-process.c
> index a3cfab1a9..1a3f39bdf 100644
> --- a/sub-process.c
> +++ b/sub-process.c
> @@ -105,3 +105,97 @@ int subprocess_start(struct hashmap *hashmap, struct subprocess_entry *entry, co
>  	hashmap_add(hashmap, entry);
>  	return 0;
>  }
> +
> +int subprocess_handshake(struct subprocess_entry *entry,
> +			 const char *welcome_prefix,
> +			 int *versions,
> +			 int *chosen_version,
> +			 struct subprocess_capability *capabilities,
> +			 unsigned int *supported_capabilities) {
> +	int version_scratch;
> +	unsigned int capabilities_scratch;
> +	struct child_process *process = &entry->process;
> +	int i;
> +	char *line;
> +	const char *p;
> +
> +	if (!chosen_version)
> +		chosen_version = &version_scratch;
> +	if (!supported_capabilities)
> +		supported_capabilities = &capabilities_scratch;
> +
> +	sigchain_push(SIGPIPE, SIG_IGN);
> +
> +	if (packet_write_fmt_gently(process->in, "%sclient\n",
> +				    welcome_prefix)) {
> +		error("Could not write client identification");
> +		goto error;
> +	}
> +	for (i = 0; versions[i]; i++) {
> +		if (packet_write_fmt_gently(process->in, "version=%d\n",
> +					    versions[i])) {
> +			error("Could not write requested version");
> +			goto error;
> +		}
> +	}

This forces version numbers to be positive integers, because 0
terminates the versions[] array. That is OK, as I do not see it as a
downside that no potential application can use "version=0".

> +	if (packet_flush_gently(process->in))
> +		goto error;
> +
> +	if (!(line = packet_read_line(process->out, NULL)) ||
> +	    !skip_prefix(line, welcome_prefix, &p) ||
> +	    strcmp(p, "server")) {
> +		error("Unexpected line '%s', expected %sserver",
> +		      line ? line : "<flush packet>", welcome_prefix);
> +		goto error;
> +	}
> +	if (!(line = packet_read_line(process->out, NULL)) ||
> +	    !skip_prefix(line, "version=", &p) ||
> +	    strtol_i(p, 10, chosen_version)) {
> +		error("Unexpected line '%s', expected version",
> +		      line ? line : "<flush packet>");
> +		goto error;
> +	}
> +	for (i = 0; versions[i]; i++) {
> +		if (versions[i] == *chosen_version)
> +			goto version_found;
> +	}
> +	error("Version %d not supported", *chosen_version);
> +	goto error;
> +version_found:

It would have been more natural to do

	for (i = 0; versions[i]; i++)
		if (versions[i] == *chosen_version)
			break;
	if (!versions[i]) {
		error("...");
		goto error;
	}

without the "version_found:" label. In general, I'd prefer to avoid
jumping to a label in the normal/expected case and reserve "goto"
for error handling.

> +	if ((line = packet_read_line(process->out, NULL))) {
> +		error("Unexpected line '%s', expected flush", line);
> +		goto error;
> +	}
> +
> +	for (i = 0; capabilities[i].name; i++) {
> +		if (packet_write_fmt_gently(process->in, "capability=%s\n",
> +					    capabilities[i].name)) {
> +			error("Could not write requested capability");
> +			goto error;
> +		}
> +	}
> +	if (packet_flush_gently(process->in))
> +		goto error;
> +
> +	while ((line = packet_read_line(process->out, NULL))) {
> +		if (!skip_prefix(line, "capability=", &p))
> +			continue;
> +
> +		for (i = 0; capabilities[i].name; i++) {
> +			if (!strcmp(p, capabilities[i].name)) {
> +				*supported_capabilities |= capabilities[i].flag;
> +				goto capability_found;
> +			}
> +		}
> +		warning("external filter requested unsupported filter capability '%s'",
> +			p);
> +capability_found:
> +		;

Likewise.

Also, this is the reason why I said this might make future
applications feel a bit too constrained; is the set of fields in the
subprocess_capability struct general enough? It can only say "a
capability with this name was found" with a single bit, so you can
have only 32 (or 64) capabilities that are all yes/no.

I am not saying that is definitely insufficient (not yet anyway); I
am wondering if future applications may need to have something like:

	capability=buffer-size=64k

where the "=64k" part is not known at this layer but is known by the
user of the API.
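
One way the "capability=buffer-size=64k" case could be handled, as a
sketch only: the generic layer splits the line into name and optional
value and hands both to a caller-supplied callback. Everything here
(capability_fn, parse_capability_line(), struct filter_caps, and the
CAP_* values, which are stand-ins for the real flags) is hypothetical
and not part of the patch.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define CAP_CLEAN  (1u << 0)	/* stand-in values for illustration */
#define CAP_SMUDGE (1u << 1)

/* Hypothetical callback: return 0 if the capability is understood. */
typedef int (*capability_fn)(const char *name, size_t name_len,
			     const char *value, void *data);

/*
 * Generic layer: split "capability=<name>[=<value>]" and let the
 * caller interpret it. Returns -1 for lines that are not capability
 * lines, otherwise whatever the callback returns.
 */
static int parse_capability_line(const char *line, capability_fn fn,
				 void *data)
{
	static const char prefix[] = "capability=";
	const char *name, *eq;

	assert(line && fn);
	if (strncmp(line, prefix, strlen(prefix)))
		return -1;
	name = line + strlen(prefix);
	eq = strchr(name, '=');
	if (eq)
		return fn(name, (size_t)(eq - name), eq + 1, data);
	return fn(name, strlen(name), NULL, data);
}

/* Caller side: state that a single bit per capability cannot hold. */
struct filter_caps {
	unsigned int flags;
	char buffer_size[16];
};

static int name_is(const char *name, size_t len, const char *want)
{
	return strlen(want) == len && !memcmp(name, want, len);
}

static int filter_capability(const char *name, size_t len,
			     const char *value, void *data)
{
	struct filter_caps *caps = data;

	if (name_is(name, len, "clean") && !value) {
		caps->flags |= CAP_CLEAN;
		return 0;
	}
	if (name_is(name, len, "smudge") && !value) {
		caps->flags |= CAP_SMUDGE;
		return 0;
	}
	if (name_is(name, len, "buffer-size") && value) {
		/* copy: the packet line buffer may be reused by the reader */
		snprintf(caps->buffer_size, sizeof(caps->buffer_size),
			 "%s", value);
		return 0;
	}
	return -1;	/* unknown capability; the caller may warn */
}
```

The lookup returns directly from each match, so no label is needed in
the expected case, in line with the "reserve goto for errors" remark.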

> +	}
> +
> +	sigchain_pop(SIGPIPE);
> +	return 0;
> +error:
> +	sigchain_pop(SIGPIPE);
> +	return 1;

I would prepare at the beginning of the function:

	int retval = -1; /* assume failure */

and rewrite the above to

	retval = 0;
error:
	sigchain_pop(SIGPIPE);
	return retval;

if I were writing this code.
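
The single-exit shape described above can be tried out in isolation.
This is a sketch under stated assumptions: push_ignore() and
pop_ignore() are hypothetical stand-ins for sigchain_push() and
sigchain_pop(), step() stands in for each handshake step, and the
label is named "out" here because the success path also falls
through it.

```c
#include <assert.h>

static int sigpipe_depth;	/* stand-in for the sigchain stack depth */

static void push_ignore(void)
{
	sigpipe_depth++;
}

static void pop_ignore(void)
{
	assert(sigpipe_depth > 0);
	sigpipe_depth--;
}

/* Hypothetical handshake step: fails (non-zero) when ok is 0. */
static int step(int ok)
{
	return ok ? 0 : -1;
}

static int handshake_sketch(int greeting_ok, int version_ok)
{
	int retval = -1;	/* assume failure */

	push_ignore();

	if (step(greeting_ok))
		goto out;
	if (step(version_ok))
		goto out;

	retval = 0;	/* every step succeeded */
out:
	pop_ignore();	/* cleanup runs exactly once on every path */
	return retval;
}
```

Because the cleanup appears only once, a step added later cannot
return without popping the handler.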