Re: [PATCH v3 01/11] argv_array: offer to split a string by whitespace

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Stefan,

On Fri, 20 Apr 2018, Stefan Beller wrote:

> On Fri, Apr 20, 2018 at 3:20 PM, Johannes Schindelin
> <johannes.schindelin@xxxxxx> wrote:
> > This is a simple function that will interpret a string as a whitespace
> > delimited list of values, and add those values into the array.
> >
> > Note: this function does not (yet) offer to split by arbitrary delimiters,
> > or keep empty values in case of runs of whitespace, or de-quote Unix shell
> > style. All fo this functionality can be added later, when and if needed.
> >
> > Signed-off-by: Johannes Schindelin <johannes.schindelin@xxxxxx>
> > ---
> >  argv-array.c | 20 ++++++++++++++++++++
> >  argv-array.h |  1 +
> >  2 files changed, 21 insertions(+)
> >
> > diff --git a/argv-array.c b/argv-array.c
> > index 5d370fa3366..cb5bcd2c064 100644
> > --- a/argv-array.c
> > +++ b/argv-array.c
> > @@ -64,6 +64,26 @@ void argv_array_pop(struct argv_array *array)
> >         array->argc--;
> >  }
> >
> > +void argv_array_split(struct argv_array *array, const char *to_split)
> > +{
> > +       while (isspace(*to_split))
> > +               to_split++;
> > +       for (;;) {
> > +               const char *p = to_split;
> > +
> > +               if (!*p)
> > +                       break;
> > +
> > +               while (*p && !isspace(*p))
> > +                       p++;
> > +               argv_array_push_nodup(array, xstrndup(to_split, p - to_split));
> > +
> > +               while (isspace(*p))
> > +                       p++;
> > +               to_split = p;
> > +       }
> > +}
> 
> The code looks correct to me.
> 
> Though this seems so low level, that I find it hard to accept
> to implement yet another low level split function.
> Would reuse of strbuf_split or string_list_split make sense?
> 
> Looking at the user in patch 5 you really want to have the
> argv array, though, so it cannot be pushed to an even higher
> abstraction layer and be solved there. You really want a
> string -> argv array split, which would mean we'd have to
> do the split via string -> {strbufs, stringlist} and then perform
> a conversion from that to argv array and both conversions
> look ugly as we'd need to iterate their specific data structure
> and push each element itself again.

Maybe we could reconcile all of this by introducing yet another layer of
indirection? I am wary of this, as I tend to think that layers of
indirection make things tedious to debug.

But I could imagine that we could introduce something like

    typedef int (*split_string_fn)(const char *item, size_t len, void *data);
    int split_string(const char *string, int delimiter, int maxsplit,
		     split_string_fn fn, void *fn_data);

And then argv_array_split() would be implemented like this:

    struct argv_array_split_data {
	struct argv_array *array;
    };

    static int argv_array_split_fn(const char *item, size_t len, void *data)
    {
	struct argv_array *array =
	    ((struct argv_array_split_data *)data)->array;
	argv_array_push_nodup(array, xstrndup(item, len));
	return 0;
    }

    void argv_array_split(struct argv_array *array, const char *string,
			  int delimiter, int maxsplit)
    {
	struct argv_array_split_data data;
	data.array = array;
	split_string(string, delimiter, maxsplit, argv_array_split_fn, &data);
    }

Possible? Yes. Desirable? Dunno. Looks like a lot of effort for little
gain so far.

> So I guess we rather implement it yet another time.
> 
> Looking at their function declarations:
> 
> /*
>  * lots of very good comments for string list splitting
>  */
> int string_list_split(struct string_list *list, const char *string,
>           int delim, int maxsplit);
> 
> /*
>  * lots of very good comments for strbuf splitting
>  */
> static inline struct strbuf **strbuf_split(const struct strbuf *sb,
>            int terminator)
> 
> I find they have comments in common as well as the
> terminator. In the commit message you defer the
> implementation of a terminating symbol, as we
> can do it later. Also the isspace takes more than one
> delimiter, which the others do not.
> 
> I am debating myself if I want to ask for a comment, as
> argv-array.h is very short for now and it would be consistent
> not to add a comment.

That was my thinking.

And I thought further: once argv_array_split() becomes more complex, we
would need to consider again whether we want that level of indirection
(and consolidation), and then also whether we need a comment.

On the other hand, argv_array's name suggests that it handles
command-lines, so I think you are correct that at least a little comment
is needed to state that this does not handle quoted arguments.

Ciao,
Dscho



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux