"Ivan Frade via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes: > From: Ivan Frade <ifrade@xxxxxxxxxx> > > In some setups, packfile uris act as bearer token. It is not > recommended to expose them plainly in logs, although in special > circunstances (e.g. debug) it makes sense to write them. > > Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT > variable is set to false. This mimics the redacting of the Authorization > header in HTTP. > > Signed-off-by: Ivan Frade <ifrade@xxxxxxxxxx> > --- > Documentation/git.txt | 5 +++-- > fetch-pack.c | 3 +++ > pkt-line.c | 40 ++++++++++++++++++++++++++++++++- > pkt-line.h | 1 + > t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++ > 5 files changed, 97 insertions(+), 3 deletions(-) > > diff --git a/Documentation/git.txt b/Documentation/git.txt > index d63c65e67d8..f64c8ce5183 100644 > --- a/Documentation/git.txt > +++ b/Documentation/git.txt > @@ -832,8 +832,9 @@ for full details. > > `GIT_TRACE_REDACT`:: > By default, when tracing is activated, Git redacts the values of > - cookies, the "Authorization:" header, and the "Proxy-Authorization:" > - header. Set this variable to `0` to prevent this redaction. > + cookies, the "Authorization:" header, the "Proxy-Authorization:" > + header and packfile URLs. Set this variable to `0` to prevent this > + redaction. Just a curiosity. Do we call these packfile URI, or packfile URL? > diff --git a/pkt-line.c b/pkt-line.c > index 2dc8ac274bd..ba0a2d65f0c 100644 > --- a/pkt-line.c > +++ b/pkt-line.c > @@ -370,6 +370,31 @@ int packet_length(const char lenbuf_hex[4]) > return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2); > } > > +static char *find_url_path(const char* buffer, int *path_len) > +{ > + const char *URL_MARK = "://"; > + char *path = strstr(buffer, URL_MARK); > + if (!path) > + return NULL; Hmph, the format we expect is "<hash> <uri>"; don't we need to validate the leading <hash> followed by SP? len = strspn(buffer, "0123456789abcdefABCDEF"); if (len != 40 || len != 64 || buffer[len] != ' ') return NULL; /* required "<hash> SP" not seen */ path = strstr(buffer + len + 1, URL_MARK); or somesuch? > + path += strlen(URL_MARK); OK. > + while (*path && *path != '/') > + path++; strchr()? > + if (!*path || !*(path + 1)) > + return NULL; OK. > + // position after '/' No // comments in our codebase, please. Unless it is a borrowed code, that is. > + path++; > + > + if (path_len) { > + char *url_end = strchrnul(path, ' '); Is this because SP is not a valid character in packfile URI, or at this point in the callchain it would be encoded or something? The format we expect is "<hash> <uri>", so we shouldn't even have to look for SP but just redact everything to the end, no? Apparently we are assuming that there won't be more than one such URL-path that needs redacting in the packet, but that is perfectly fine, as the sole goal of this helper is to identify the packfile URI packet and redact it in the log. > + *path_len = url_end - path; > + } > + > + return path; > +} > - packet_trace(buffer, len, 0); > + if (options & PACKET_READ_REDACT_URL_PATH && > + (url_path_start = find_url_path(buffer, &url_path_len))) { > + const char *redacted = "<redacted>"; > + struct strbuf tracebuf = STRBUF_INIT; > + strbuf_insert(&tracebuf, 0, buffer, len); > + strbuf_splice(&tracebuf, url_path_start - buffer, > + url_path_len, redacted, strlen(redacted)); > + packet_trace(tracebuf.buf, tracebuf.len, 0); > + strbuf_release(&tracebuf); I briefly wondered if the repeated allocation (and more fundamentally, preparing the redacted copy of packet whether we are actually tracing the packet in the first place) is blindly wasting the resources too much, but this only happens in the protocol header part, so it might be OK. Even if that is not the case, we should be able to update fetch_pack.c::do_fetch_pack_v2() so that the REDACT_URL_PATH bit is turned on in a much narrower region of code, right? Enable when we enter the GET_PACK state and drop the bit when we are done with the packfile URI packets, or something? Thanks for working on this. > + } else { > + packet_trace(buffer, len, 0); > + }