On Sat, 2 Dec 2023 at 18:05, Alejandro Colomar via Gcc-help <gcc-help@xxxxxxxxxxx> wrote:
>
> On Sat, Dec 02, 2023 at 01:29:01PM +0100, Alejandro Colomar wrote:
> > On Sat, Dec 02, 2023 at 12:50:28PM +0100, Alejandro Colomar wrote:
> > > Hi,
> > >
> > > I've been implementing my own copy of strto[iu](3bsd), to avoid the
> > > complexity of calling strtol(3) et al. In the process, I've noticed
> > > that all of these functions use restrict for their parameters.
> > >
> > > Why do these functions use restrict? While the second parameter is not
> > > used for accessing nptr memory (**endptr is not accessed), it can point
> > > to the same memory. Here is an example of how these functions can have
> > > pointers to the same memory in the two arguments.
> > >
> > >     l = strtol(p, &p, 0);
> > >
> > > The use of restrict in the prototype of the function could result in
> > > compiler warnings, no? Currently, I don't see any warnings, but I
> > > suspect the compiler could complain, since the same memory is available
> > > to the function via two different arguments (albeit with a different
> > > number of references).
> > >
> > > The use of restrict in the definition of the function doesn't help the
> > > optimizer, since it already knows that the second parameter is out-only,
> > > so even if it weren't restrict, the only way to access memory is via the
> > > first parameter.
> >
> > In the case of strto[iu](3bsd), I have even more doubts.
> >
> > Here's libbsd's version of it (omitting unimportant parts):
> >
> > $ grepc -tfd strtoi .
> > ./src/strtoi.c:intmax_t
> > strtoi(const char *__restrict nptr,
> >     char **__restrict endptr, int base,
> >     intmax_t lo, intmax_t hi, int *rstatus)
> > {
> >     ...
> >
> >     im = strtoimax(nptr, endptr, base);
> >
> >     *rstatus = errno;
> >     errno = serrno;
> >
> >     if (*rstatus == 0) {
> >         /* No digits were found */
> >         if (nptr == *endptr)
> >             *rstatus = ECANCELED;
> >         /* There are further characters after number */
> >         else if (**endptr != '\0')
> >             *rstatus = ENOTSUP;
> >     }
> >
> >     ...
> >
> >     return im;
> > }
> >
> > Let's say the base is unsupported (e.g., -42), and endptr initially
> > points to nptr-1. Imagine this call:
> >
> >     i = strtoimax(p + 1, &p, -42);
> >
> > ISO C doesn't specify what happens if the base is not between 0 and 36,
> > so the behavior is probably undefined in ISO C.
> >
> > POSIX says it returns 0 and sets errno to EINVAL, but doesn't say what
> > happens to endptr. I expect two possible implementations:
> >
> > - Leave endptr untouched.
> > - Set *endptr = nptr.
> >
> > Let's suppose it leaves endptr untouched (otherwise, it would be
> > impossible to portably differentiate an EINVAL due to unsupported base
> > from an EINVAL due to no digits in the string).
> >
> > So, the test (nptr == *endptr) would be false (because p+1 != p), and
> > the code would jump into accessing **endptr without having derived
> > that pointer from nptr, which is a violation of restrict.
>
> Oops, it's within an (errno == 0) path, so *endptr is guaranteed to be
> derived from nptr here.
>
> So no bug, but still unclear to me what's the benefit of using restrict,

The section "7. Library" at [1] has some information about the 'restrict'
keyword.

I think the restrict qualifiers require the caller to keep the string
(or the portion of the string that strtol actually accesses) and the
pointer object that endptr points to in non-overlapping memory regions.
A call such as strtol(p, &p, 0) satisfies that, so it should be
well-defined.

-------------------
[1] https://www.open-std.org/jtc1/sc22/wg14/www/docs/n881.pdf

-Amol

> and also unclear why GCC doesn't warn about it at call site.
>
> > I made many assumptions here, where the standards are not clear, so I
> > may be wrong in some of them. But it looks to me like a bug.
> >
> > CCing libbsd.
> >
> > Cheers,
> > Alex
> >
> > --
> > <https://www.alejandro-colomar.es/>
>
> --
> <https://www.alejandro-colomar.es/>
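
To make the strtol(p, &p, 0) case concrete, here is a minimal sketch.
parse_longs is a hypothetical helper written for this mail, not taken
from libbsd or any libc. As I read the restrict rules, the call is fine
because the characters are only read through nptr, while the only
object written through endptr is the pointer variable p itself, which
lives in separate storage from the string.

```c
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Hypothetical helper (illustration only): parse up to max integers
 * from s into out[], and return how many were parsed.
 */
static size_t
parse_longs(const char *s, long *out, size_t max)
{
    /* strtol() wants a char ** for endptr, so keep a non-const cursor. */
    char    *p = (char *) s;
    size_t  n = 0;

    while (n < max) {
        char  *start = p;
        long  v;

        errno = 0;
        /*
         * The value of p is passed as nptr, and the address of p as
         * endptr.  The string and the object p do not overlap.
         */
        v = strtol(p, &p, 0);

        /* Stop when no digits were consumed or on a range error. */
        if (p == start || errno != 0)
            break;

        out[n++] = v;
    }

    return n;
}
```

For example, with the input "10 0x1f -7 junk" and room for four
values, this stores 10, 31 and -7 and returns 3: the fourth call
consumes no digits, so strtol sets p back to its starting position
and the loop stops.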