Re: [PATCH 1/7] strbuf: add "include_delim" parameter to "strbuf_split"

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Christian Couder <chriscool@xxxxxxxxxxxxx> writes:

> Yes, my patch does not do that, because I think including the delimiter is a 
> special case of the more general and useful behavior of not including it.

You got it backwards.

With the way the returned string is used by the single caller that your
patch adds (which splits at ","), I would agree that lack of delimiter
allows the result to get used directly in the further processing.

But even in that codepath, I have to say that it is just lazy programming
that the caller does not postprocess the returned value from the splitter
function.  If it wants not just accept input such as "a,b,c" but also
wants to tolerate things like "a, b, c", it will have to look at the
resulting string, and ignoring the delimiter at the end becomes just a
small part of the general clean-up process [*1*].

Once you start allowing "split at one of these characters" and/or "split
at delimiter that matches this pattern", you cannot just discard the
delimiter if you want to support non-simplistic callers, because they
would want to know what the delimiter was.

Stripping out the delimiter is the special case for simplistic callers
(think "gets()" that strips, and "fgets()" that doesn't).  A more general
solution should be by default not to strip it, and I do not think your new
caller, if it were written correctly, needs stripping behaviour either.
That means there is no need for the "optionally strip" flag to the
function in order to support the rest of the series [*2*].

Also comparing this with Perl/Python split() forgets that you are working
in C, where truncating an existing string is quite cheap (just assign '\0').
There is a different trade-off to be made in these language environments.

[Footnote]

*1* and this is not a made-up feature enhancement request.  If you tell
people that they can give more than one value with comma separated, some
of them _will_ feed you --option="a, b, c".  Your parser can error out by
saying "I do not understand ' b'", but you will be told "What other
possible interpretation is there for that thing, other than 'b'!", and
would grudgingly have to change your code to accept such a list.

*2* In any case, as I told you in a separate review comments, I think
passing a huge list as a command line parameter, be it separated with
comma or whatever, is not an appropriate way to solve the issue of
filter_skipped() your primary topic seems to be trying to address, so I
expect your series would not need strbuf_split() at all.  You would most
likely be calling for_each_ref(), looking at the refs under "refs/bisect"
hierarchy, instead of having shell feed you the list.


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux