Re: [PATCH v2 1/6] version: refactor redact_non_printables()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jan 17, 2025 at 11:56 PM Junio C Hamano <gitster@xxxxxxxxx> wrote:
>
> Usman Akinyemi <usmanakinyemi202@xxxxxxxxx> writes:
>
> > The git_user_agent_sanitized() function performs some sanitizing to
> > avoid special characters being sent over the line and possibly messing
> > up with the protocol or with the parsing on the other side.
> >
> > Let's extract this sanitizing into a new redact_non_printables() function,
> > as we will want to reuse it in a following patch.
> >
> > For now the new redact_non_printables() function is still static as
> > it's only needed locally.
> >
> > While at it, let's use strbuf_detach() to explicitly detach the string
> > contained by the 'buf' strbuf.
> >
> > Mentored-by: Christian Couder <chriscool@xxxxxxxxxxxxx>
> > Signed-off-by: Usman Akinyemi <usmanakinyemi202@xxxxxxxxx>
> > ---
> >  version.c | 22 ++++++++++++++++------
> >  1 file changed, 16 insertions(+), 6 deletions(-)
> >
> > diff --git a/version.c b/version.c
> > index 4d763ab48d..78f025c808 100644
> > --- a/version.c
> > +++ b/version.c
> > @@ -6,6 +6,20 @@
> >  const char git_version_string[] = GIT_VERSION;
> >  const char git_built_from_commit_string[] = GIT_BUILT_FROM_COMMIT;
> >
> > +/*
> > + * Trim and replace each character with ascii code below 32 or above
> > + * 127 (included) using a dot '.' character.
>
> /*
>  * Trim and replace each byte outside ASCII printable
>  * (33 to 127, inclusive) with a dot '.'.
>  */
>
> perhaps?
This sounds confusing, it sounds like the byte we are replacing with dot are
in the range of 33 to 127 whereas, it is those outside these range.
>
> > + * TODO: ensure consecutive non-printable characters are only replaced once
>
> I am not sure what your plans are for this change.  Has the list
> reached the consensus to squish consecutive redaction dots into one
> in the user-agent string?  If not, let's not mention it.  Making an
> incompatible change to the user-agent string is not the primary aim
> of this topic anyway.
>
> > +*/
>
> Funny indentation.  The asterisk should have a SP before it, just
> like on the previous lines.
Mistake, thanks for catching it, will make a change to it in the next
patch series.
>
> > +static void redact_non_printables(struct strbuf *buf)
> > +{
> > +     strbuf_trim(buf);
> > +     for (size_t i = 0; i < buf->len; i++) {
> > +             if (buf->buf[i] <= 32 || buf->buf[i] >= 127)
>
> <sane-ctype.h> defines isprint() we can use here.
I think it would be better to add this in another commit so that one commit
does one thing. I will add it after this patch series got settled,
what do you think ?
>
> > +                     buf->buf[i] = '.';
> > +     }
> > +}
>
> Do we want to do anything special when the resulting buf->buf[]
> becomes empty or just full of dots without anything else?  Should
> the caller be told about such a condition, or is it callers'
> responsibility to check if they care?  I am inclined to say that it
> is the latter.
I agreed.
>
> > @@ -27,12 +41,8 @@ const char *git_user_agent_sanitized(void)
> >               struct strbuf buf = STRBUF_INIT;
> >
> >               strbuf_addstr(&buf, git_user_agent());
> > -             strbuf_trim(&buf);
> > -             for (size_t i = 0; i < buf.len; i++) {
> > -                     if (buf.buf[i] <= 32 || buf.buf[i] >= 127)
> > -                             buf.buf[i] = '.';
> > -             }
> > -             agent = buf.buf;
> > +             redact_non_printables(&buf);
> > +             agent = strbuf_detach(&buf, NULL);
> >       }
> >
> >       return agent;





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux