Re: [PATCH v4 1/2] fetch-pack: redact packfile urls in traces

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



"Ivan Frade via GitGitGadget" <gitgitgadget@xxxxxxxxx> writes:

> From: Ivan Frade <ifrade@xxxxxxxxxx>
>
> In some setups, packfile uris act as bearer token. It is not
> recommended to expose them plainly in logs, although in special
> circunstances (e.g. debug) it makes sense to write them.
>
> Redact the packfile URL paths by default, unless the GIT_TRACE_REDACT
> variable is set to false. This mimics the redacting of the Authorization
> header in HTTP.
>
> Signed-off-by: Ivan Frade <ifrade@xxxxxxxxxx>
> ---
>  Documentation/git.txt  |  5 +++--
>  fetch-pack.c           |  3 +++
>  pkt-line.c             | 40 ++++++++++++++++++++++++++++++++-
>  pkt-line.h             |  1 +
>  t/t5702-protocol-v2.sh | 51 ++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 97 insertions(+), 3 deletions(-)
>
> diff --git a/Documentation/git.txt b/Documentation/git.txt
> index d63c65e67d8..f64c8ce5183 100644
> --- a/Documentation/git.txt
> +++ b/Documentation/git.txt
> @@ -832,8 +832,9 @@ for full details.
>  
>  `GIT_TRACE_REDACT`::
>  	By default, when tracing is activated, Git redacts the values of
> -	cookies, the "Authorization:" header, and the "Proxy-Authorization:"
> -	header. Set this variable to `0` to prevent this redaction.
> +	cookies, the "Authorization:" header, the "Proxy-Authorization:"
> +	header and packfile URLs. Set this variable to `0` to prevent this
> +	redaction.

Just a curiosity.  Do we call these packfile URI, or packfile URL?

> diff --git a/pkt-line.c b/pkt-line.c
> index 2dc8ac274bd..ba0a2d65f0c 100644
> --- a/pkt-line.c
> +++ b/pkt-line.c
> @@ -370,6 +370,31 @@ int packet_length(const char lenbuf_hex[4])
>  	return (val < 0) ? val : (val << 8) | hex2chr(lenbuf_hex + 2);
>  }
>  
> +static char *find_url_path(const char* buffer, int *path_len)
> +{
> +	const char *URL_MARK = "://";
> +	char *path = strstr(buffer, URL_MARK);
> +	if (!path)
> +		return NULL;

Hmph, the format we expect is "<hash> <uri>"; don't we need to
validate the leading <hash> followed by SP?

    len = strspn(buffer, "0123456789abcdefABCDEF");
    if (len != 40 || len != 64 || buffer[len] != ' ')
	return NULL; /* required "<hash> SP" not seen */
    path = strstr(buffer + len + 1, URL_MARK);

or somesuch?

> +	path += strlen(URL_MARK);

OK.

> +	while (*path && *path != '/')
> +		path++;

strchr()?

> +	if (!*path || !*(path + 1))
> +		return NULL;

OK.

> +	// position after '/'

No // comments in our codebase, please.  Unless it is a borrowed
code, that is.

> +	path++;
> +
> +	if (path_len) {
> +		char *url_end = strchrnul(path, ' ');

Is this because SP is not a valid character in packfile URI, or at
this point in the callchain it would be encoded or something?  The
format we expect is "<hash> <uri>", so we shouldn't even have to
look for SP but just redact everything to the end, no?

Apparently we are assuming that there won't be more than one such
URL-path that needs redacting in the packet, but that is perfectly
fine, as the sole goal of this helper is to identify the packfile
URI packet and redact it in the log.

> +		*path_len = url_end - path;
> +	}
> +
> +	return path;
> +}

> -	packet_trace(buffer, len, 0);
> +	if (options & PACKET_READ_REDACT_URL_PATH &&
> +	    (url_path_start = find_url_path(buffer, &url_path_len))) {
> +		const char *redacted = "<redacted>";
> +		struct strbuf tracebuf = STRBUF_INIT;
> +		strbuf_insert(&tracebuf, 0, buffer, len);
> +		strbuf_splice(&tracebuf, url_path_start - buffer,
> +			      url_path_len, redacted, strlen(redacted));
> +		packet_trace(tracebuf.buf, tracebuf.len, 0);
> +		strbuf_release(&tracebuf);

I briefly wondered if the repeated allocation (and more
fundamentally, preparing the redacted copy of packet whether we are
actually tracing the packet in the first place) is blindly wasting
the resources too much, but this only happens in the protocol header
part, so it might be OK.

Even if that is not the case, we should be able to update
fetch_pack.c::do_fetch_pack_v2() so that the REDACT_URL_PATH bit is
turned on in a much narrower region of code, right?  Enable when we
enter the GET_PACK state and drop the bit when we are done with the
packfile URI packets, or something?

Thanks for working on this.


> +	} else {
> +		packet_trace(buffer, len, 0);
> +	}



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux