On Fri, Jan 17, 2025 at 11:56 PM Junio C Hamano <gitster@xxxxxxxxx> wrote: > > Usman Akinyemi <usmanakinyemi202@xxxxxxxxx> writes: > > > The git_user_agent_sanitized() function performs some sanitizing to > > avoid special characters being sent over the line and possibly messing > > up with the protocol or with the parsing on the other side. > > > > Let's extract this sanitizing into a new redact_non_printables() function, > > as we will want to reuse it in a following patch. > > > > For now the new redact_non_printables() function is still static as > > it's only needed locally. > > > > While at it, let's use strbuf_detach() to explicitly detach the string > > contained by the 'buf' strbuf. > > > > Mentored-by: Christian Couder <chriscool@xxxxxxxxxxxxx> > > Signed-off-by: Usman Akinyemi <usmanakinyemi202@xxxxxxxxx> > > --- > > version.c | 22 ++++++++++++++++------ > > 1 file changed, 16 insertions(+), 6 deletions(-) > > > > diff --git a/version.c b/version.c > > index 4d763ab48d..78f025c808 100644 > > --- a/version.c > > +++ b/version.c > > @@ -6,6 +6,20 @@ > > const char git_version_string[] = GIT_VERSION; > > const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT; > > > > +/* > > + * Trim and replace each character with ascii code below 32 or above > > + * 127 (included) using a dot '.' character. > > /* > * Trim and replace each byte outside ASCII printable > * (33 to 127, inclusive) with a dot '.'. > */ > > perhaps? This sounds confusing, it sounds like the byte we are replacing with dot are in the range of 33 to 127 whereas, it is those outside these range. > > > + * TODO: ensure consecutive non-printable characters are only replaced once > > I am not sure what your plans are for this change. Has the list > reached the consensus to squish consecutive redaction dots into one > in the user-agent string? If not, let's not mention it. Making an > incompatible change to the user-agent string is not the primary aim > of this topic anyway. > > > +*/ > > Funny indentation. The asterisk should have a SP before it, just > like on the previous lines. Mistake, thanks for catching it, will make a change to it in the next patch series. > > > +static void redact_non_printables(struct strbuf *buf) > > +{ > > + strbuf_trim(buf); > > + for (size_t i = 0; i < buf->len; i++) { > > + if (buf->buf[i] <= 32 || buf->buf[i] >= 127) > > <sane-ctype.h> defines isprint() we can use here. I think it would be better to add this in another commit so that one commit does one thing. I will add it after this patch series got settled, what do you think ? > > > + buf->buf[i] = '.'; > > + } > > +} > > Do we want to do anything special when the resulting buf->buf[] > becomes empty or just full of dots without anything else? Should > the caller be told about such a condition, or is it callers' > responsibility to check if they care? I am inclined to say that it > is the latter. I agreed. > > > @@ -27,12 +41,8 @@ const char *git_user_agent_sanitized(void) > > struct strbuf buf = STRBUF_INIT; > > > > strbuf_addstr(&buf, git_user_agent()); > > - strbuf_trim(&buf); > > - for (size_t i = 0; i < buf.len; i++) { > > - if (buf.buf[i] <= 32 || buf.buf[i] >= 127) > > - buf.buf[i] = '.'; > > - } > > - agent = buf.buf; > > + redact_non_printables(&buf); > > + agent = strbuf_detach(&buf, NULL); > > } > > > > return agent;