On Thu, Apr 13, 2023 at 06:39:18PM -0700, Junio C Hamano wrote: > Taylor Blau <me@xxxxxxxxxxxx> writes: > > > - `strtok_r()` forces the caller to maintain an extra string pointer > > to pass as its `saveptr` value > > > > - `strtok_r()` also requires that its `saveptr` value be unmodified > > between calls. > > > > - `strtok()` (and by extension, `strtok_r()`) is confusing when used > > across multiple functions, since the caller is supposed to pass NULL > > as its first argument after the first call. This makes it difficult > > to determine what string is actually being tokenized without clear > > dataflow. > > It seems that the only existing users of strtok() are all in > t/helper/ directory, so I personally do not think it is a huge loss > if these two are forbidden. While I do not see why we should use > strtok(), none of the above sound like sensible reasons to ban > strtok_r(). At best, they may point out awkwardness of the function > to make you try finding an alternative that is easier-to-use before > choosing strtok_r() for your application on a case-by-case basis. For what it's worth, I could certainly live if we accomplished getting strtok(2) on the banned list, but left strtok_r(2) un-banned. TBH, I think that leaving the reenterant version of a banned function as un-banned is a little awkward, I don't mind it if you don't feel like the above are sufficient reasons to ban it. > If your application wants to chomp a string into tokens from left to > right, inspecting the resulting token one-by-one as it goes until it > hits a token that satisfies some condition and then terminate > without wasting cycles on the rest, string_list_split_in_place() is > a poor choice. In such a use case, you do not know upfront where in > the string the sought-after token would be, so you have to split the > string in full without taking an early exit via maxsplit. Also, you > are restricted to a single byte value for the delimiter, and unlike > strtok[_r](), string_list_split_in_place() does not squash a run of > delimiter bytes into one inter-token delimiter. I don't quite agree with this. In practice, you could repeatedly call `string_list_split_in_place()` with maxsplit of "1", using the tail of the string list you're splitting into as the string to split. That would allow you to split tokens one at a time into the string list without having to split the whole line up front. That all said, I don't think that we have such a use case in the tree, at least from my searching for strtok() and string_list_split_in_place(). It may be that using strtok_r() for such a thing would be less awkward: not having tried both (and having no examples to reference) I honestly do not know for certain. > One gripe I have against use of strtok() is actually not with > threading but because people often misuse it when strcspn() is what > they want (i.e. measure the length of the "first token", so that > they can then xmemdupz() a copy out), forgetting that strtok[_r]() > is destructive. Heh, I'm happy to add that to this, if you want ;-). Thanks, Taylor