Re: [PATCH/RFC 06/10] strbuf: introduce strbuf_split_str_without_term()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Wed, Nov 11, 2015 at 2:44 PM, Karthik Nayak <karthik.188@xxxxxxxxx> wrote:
> The current implementation of 'strbuf_split_buf()' includes the
> terminator at the end of each strbuf post splitting. Include an option

s/Include an/Add an/

> wherein we can drop the terminator if required. In this context

s/required/desired/

> introduce a wrapper function 'strbuf_split_str_without_term()' which
> splits a given string into strbufs without including the terminator.
>
> Signed-off-by: Karthik Nayak <Karthik.188@xxxxxxxxx>
> ---
> diff --git a/ref-filter.c b/ref-filter.c
> @@ -892,14 +892,11 @@ static void populate_value(struct ref_array_item *ref)
>                          * TODO: Implement a function similar to strbuf_split_str()
>                          * which would omit the separator from the end of each value.
>                          */
> -                       s = to_free = strbuf_split_str(valp, ',', 0);
> +                       s = to_free = strbuf_split_str_without_term(valp, ',', 0);
>
>                         align->position = ALIGN_LEFT;
>
>                         while (*s) {
> -                               /*  Strip trailing comma */
> -                               if (s[1])
> -                                       strbuf_setlen(s[0], s[0]->len - 1);

I'd prefer to see this ref-filter.c change split out as a separate
patch so as not to pollute the otherwise single-purpose change
introduced by this patch (i.e. capability to omit the terminator).

Also, it might make sense to move this patch to the head of the
series, since it's conceptually distinct from the rest of the patches,
and could conceivably prove useful on its own, regardless of how the
rest of the series fares.

>                                 if (!strtoul_ui(s[0]->buf, 10, (unsigned int *)&width))
>                                         ;
>                                 else if (!strcmp(s[0]->buf, "left"))
> diff --git a/strbuf.c b/strbuf.c
> @@ -115,7 +115,7 @@ void strbuf_tolower(struct strbuf *sb)
>  }
>
>  struct strbuf **strbuf_split_buf(const char *str, size_t slen,
> -                                int terminator, int max)
> +                                int terminator, int max, int with_term)

"with_term" might undesirably be interpreted as meaning "use this
particular term". Perhaps a different name, such as "include_term",
"drop_term", or "omit_term" would be a bit less ambiguous. (I think I
prefer "omit_term".)

>  {
>         struct strbuf **ret = NULL;
>         size_t nr = 0, alloc = 0;
> @@ -123,18 +123,27 @@ struct strbuf **strbuf_split_buf(const char *str, size_t slen,
>
>         while (slen) {
>                 int len = slen;
> +               int term = with_term;

"term" is not a great variable name, and is easily confused with the
existing "terminator" input argument. This is really being used as a
length adjustment, so perhaps a name such as 'term_adjust' or
'len_adjust' or something better would be preferable.

Also, since the value of 'with_term' never changes, then 'term' will
have the same value each time through the loop, thus you could
(should) hoist the declaration and initialization of 'term' outside of
the loop.

Due to the way you're using this variable ('term'), you want its value
always to be 0 or 1 but you don't do anything to ensure that. What if
the user passes in 42 rather than 0 or 1? That would mess up your
(below) calculations. Worse, what if the user passes in -42? That
would be particularly alarming. To turn this into a boolean value (0
or 1), do this instead:

    int term = !!with_term;

>                 if (max <= 0 || nr + 1 < max) {
>                         const char *end = memchr(str, terminator, slen);
>                         if (end)
> -                               len = end - str + 1;
> +                               len = end - str + term;
> +                       else
> +                               /*  When no terminator present, we must add the last character */
> +                               term = 1;
>                 }
>                 t = xmalloc(sizeof(struct strbuf));
>                 strbuf_init(t, len);
>                 strbuf_add(t, str, len);
>                 ALLOC_GROW(ret, nr + 2, alloc);
>                 ret[nr++] = t;
> -               str += len;
> -               slen -= len;
> +               if (!term) {
> +                       str += len + 1;
> +                       slen -= len + 1;
> +               } else {
> +                       str += len;
> +                       slen -= len;
> +               }

This new logic is complex and confusing, thus difficult to review for
correctness. Rather than messing with 'len' and the existing logic,
how about instead, just adjusting the amount you store in the strbuf?
That is, instead of all the above changes, you might be able to get by
with one little change, something like this (untested):

    - strbuf_add(t, str, len);
    + strbuf_add(t, str, len - !!end * !!with_term);

>         }
>         ALLOC_GROW(ret, nr + 1, alloc); /* In case string was empty */
>         ret[nr] = NULL;
> diff --git a/strbuf.h b/strbuf.h
> @@ -465,19 +465,25 @@ static inline int strbuf_strip_suffix(struct strbuf *sb, const char *suffix)
>   * For lighter-weight alternatives, see string_list_split() and
>   * string_list_split_in_place().
>   */
> -extern struct strbuf **strbuf_split_buf(const char *, size_t,
> -                                       int terminator, int max);
> +extern struct strbuf **strbuf_split_buf(const char *str, size_t slen,
> +                                       int terminator, int max, int with_term);

You also need to update the comment block above this declaration since
it still says that each substring includes the terminator. It also
fails to mention the new 'with_term' argument added by this patch.

> +static inline struct strbuf **strbuf_split_str_without_term(const char *str,
> +                                                           int terminator, int max)

This is an uncomfortably long function name. Unfortunately, short and
sweet strbuf_split() is already taken. Perhaps
strbuf_split_str_drop_term()? strbuf_split_str_omit_term()?
strbuf_split_str_no_term()? strbuf_split_noterm()?

> +{
> +       return strbuf_split_buf(str, strlen(str), terminator, max, 0);
> +}
>
>  static inline struct strbuf **strbuf_split_str(const char *str,
>                                                int terminator, int max)
>  {
> -       return strbuf_split_buf(str, strlen(str), terminator, max);
> +       return strbuf_split_buf(str, strlen(str), terminator, max, 1);
>  }
>
>  static inline struct strbuf **strbuf_split_max(const struct strbuf *sb,
>                                                 int terminator, int max)
>  {
> -       return strbuf_split_buf(sb->buf, sb->len, terminator, max);
> +       return strbuf_split_buf(sb->buf, sb->len, terminator, max, 1);
>  }
>
>  static inline struct strbuf **strbuf_split(const struct strbuf *sb,
> --
> 2.6.2
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]