Re: [PATCH 2/5] t/helper/test-hashmap.c: avoid using `strtok()`

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Tue, Apr 18, 2023 at 06:23:20AM -0400, Jeff King wrote:
> On Thu, Apr 13, 2023 at 07:31:46PM -0400, Taylor Blau wrote:
>
> > Avoid using the non-reentrant `strtok()` to separate the parts of each
> > incoming command. Instead of replacing it with `strtok_r()`, let's
> > instead use the more friendly `string_list_split_in_place_multi()`.
>
> Junio mentioned this offhand in his response, but I wanted to highlight
> one difference in before/after behavior here.
>
>   [before]
>   $ printf 'add foo    bar\niterate\n' | t/helper/test-tool hashmap
>   foo bar
>
>   [after]
>   $ printf 'add foo    bar\niterate\n' | t/helper/test-tool hashmap
>   foo    bar
>
> I think that's fine for this test script, but it may be an indication
> that we want string-list's split to support different semantics.

I agree that it's OK for the test scripts, but probably isn't the right
thing for all cases. In the reroll I'll send shortly, I reimplemented
this in a way that (a) doesn't change the behavior of
`string_list_split_in_place()`, and (b) allows us to easily change the
semantics for `string_list_split_in_place_multi()`.

> > @@ -159,21 +161,34 @@ int cmd__hashmap(int argc, const char **argv)
> >
> >  	/* process commands from stdin */
> >  	while (strbuf_getline(&line, stdin) != EOF) {
> > -		char *cmd, *p1 = NULL, *p2 = NULL;
> > +		char *cmd, *p1, *p2;
> >  		unsigned int hash = 0;
> >  		struct test_entry *entry;
> >
> > +		/*
> > +		 * Because we memdup() the arguments out of the
> > +		 * string_list before inserting them into the hashmap,
> > +		 * it's OK to set its length back to zero to avoid
> > +		 * re-allocating the items array once per line.
> > +		 *
> > +		 * By doing so, we'll instead overwrite the existing
> > +		 * entries and avoid re-allocating.
> > +		 */
> > +		parts.nr = 0;
>
> I think this is OK, but I wonder if we should wrap it in a function that
> makes sure the strings aren't owned by the strdup (and thus aren't being
> leaked). Something like:
>
>   void string_list_setlen(struct string_list *sl, size_t nr)
>   {
> 	/* alternatively, I guess we could actually free nr..sl->nr */
> 	if (sl->strdup_strings)
> 		BUG("you can't setlen on a string-list which owns its strings");
> 	if (nr > sl->nr)
> 		BUG("you can't grow a string-list with setlen");
> 	sl->nr = nr;
>   }
>
> That is probably overkill for these two test helpers, but I wonder if
> the pattern might escape into "real" code (especially if we suggest
> using string-list instead of strtok).

I like this suggestion. I added a new patch (for which you are listed
under Co-authored-by) which adds this function as-is.

Thanks,
Taylor



[Index of Archives]     [Linux Kernel Development]     [Gcc Help]     [IETF Annouce]     [DCCP]     [Netdev]     [Networking]     [Security]     [V4L]     [Bugtraq]     [Yosemite]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Linux RAID]     [Linux SCSI]     [Fedora Users]

  Powered by Linux