Re: [PATCH v2 1/6] string-list: introduce `string_list_split_in_place_multi()`

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Am 22.04.23 um 13:12 schrieb Jeff King:
> On Tue, Apr 18, 2023 at 03:18:43PM -0400, Taylor Blau wrote:
>
>> Introduce a variant of the `string_list_split_in_place()` function that
>> takes a string of accepted delimiters.
>>
>> By contrast to its cousin `string_list_split_in_place()` which splits
>> the given string at every instance of the single character `delim`, the
>> `_multi` variant splits the given string any any character appearing in
>> the string `delim`.
>>
>> Like `strtok()`, the `_multi` variant skips past sequential delimiting
>> characters. For example:
>>
>>     string_list_split_in_place(&xs, xstrdup("foo::bar::baz"), ":", -1);
>>
>> would place in `xs` the elements "foo", "bar", and "baz".

Which one is it here really, string_list_split_in_place() or _multi()?

> I have mixed feelings on this.
>
> There are multiple orthogonal options here: single/multiple delimiter,
> and how to deal with sequential delimiters (you call it "runs" here,
> though I think of it inverted as "allow empty fields"). Plus existing
> ones like maxsplit.
>
> Your underlying function, string_list_split_in_place_1(), handles these
> independently. But it seems like a subtle gotcha that
> string_list_split_in_place() and its _multi() variant, which purport to
> differ only in one dimension (representation of delimiter list), also
> subtly differ in another dimension. Especially because it's a facet that
> might not come up in simple tests, I can see somebody incorrectly
> applying one or the other.
>
> Obviously one solution is to add the "runs" option to all variants. But
> I'd be hesitant to burden existing callers. So I'd propose one of:
>
>   1. Make your _1() function public, with a name like _with_options() or
>      something (though the function name is sadly already quite long).
>      Leave string_list_split_in_place() as a wrapper that behaves as
>      now, and have the few new callers use the with_options() variant.
>
>   2. Don't bother implementing the "runs" version. The only users would
>      be test programs, and nobody cares much either way for their cases.
>      Document in the commit message (and possibly above the function)
>      that this isn't a strict replacement for strtok(). That would
>      hopefully be enough for a potential caller to think about the
>      behavior, and we can add "runs" if and when somebody actually wants
>      it.

You can call string_list_remove_empty_items() to get rid of empty
strings.  And if the single-char version calls a multi-char version
under the hood then its only advantage is backward compatibility -- but
converting all callers isn't that hard.  So the only string_list split
function version you really need accepts multiple delimiters and
preserves empty items and can be called string_list_split_in_place().

René





[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux