Hi Zack,

On Wed, Dec 06, 2023 at 03:52:34PM +0100, Alejandro Colomar wrote:
> Several programmers have been confused about this use of 'deprecated'.
>
> Also, maximum field width can be used with these fields to mitigate the
> problem.  Still, it's only a mitigation, since it limits the number of
> characters read, but that means an input of LONG_MAX+1 --which takes up
> the same number of characters than LONG_MAX-- would still cause UB; or
> one can limit that to well below the limit of UB, but then you
> artificially invalidate valid input.  No good way to avoid UB with
> sscanf(3), but it's not necessarily bad with trusted input (and
> strtol(3) isn't the panacea either; strtoi(3) is good, though, but not
> standard).
>
> Try to be more convincing in BUGS instead.
>
> Link: <https://stackoverflow.com/questions/77601832/man-sscanf-d-is-deprecated-in-c-or-glibc/>
> Cc: Lee Griffiths <poddster@xxxxxxxxx>
> Cc: Zack Weinberg <zack@xxxxxxxxxxxx>
> Signed-off-by: Alejandro Colomar <alx@xxxxxxxxxx>
> ---
>
> Hi Lee!
>
> Thanks for the report.  After seeing how much frustration it has caused,
> I propose this change.  Does it look good to you?
>
> Thanks,
> Alex

Formatted page:

BUGS
   Numeric conversion specifiers
       Use of the numeric conversion specifiers produces Undefined
       Behavior for invalid input.  See C11 7.21.6.2/10.  This is a bug
       in the ISO C standard, and not an inherent design issue with the
       API.  However, current implementations are not safe from that
       bug, so it is not recommended to use them.  Instead, programs
       should use functions such as strtol(3) to parse numeric input.
       This manual page deprecates use of the numeric conversion
       specifiers until they are fixed by ISO C.

I think it would be good if glibc would make promises about sscanf(3) on
untrusted input.  How about guaranteeing a value of -1 and ERANGE if the
integer would overflow?

The current implementation, AFAIK, uses strtol(3), so it has the
following behavior:

-  For %d, if the value is >INT_MAX but <=LONG_MAX, the wrap-around
   value is stored, and errno is not set.

-  For %d, if the value is >LONG_MAX, -1 is stored, and errno is set.

    $ cat sscanf.c
    #define _GNU_SOURCE
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>

    #define wrap(s)  do \
    { \
        int  i, ret; \
    \
        errno = 0; \
        ret = sscanf(s, "%d", &i); \
        printf("%s: ret= %d, val= %d, errno= %s\n", \
               #s, ret, i, strerrorname_np(errno)); \
    } while (0)

    int
    main(void)
    {
        char  str_a[] = "9223372036854775828";  // 2^63 + 20
        char  str_s[] = "8589934599";           // 2^33 + 7
        char  str_d[] = "4294967290";           // 2^32 - 6
        char  str_f[] = "2147483678";           // 2^31 + 30
        char  str_g[] = "2147483638";           // 2^31 - 10

        wrap(str_a);
        wrap(str_s);
        wrap(str_d);
        wrap(str_f);
        wrap(str_g);
    }
    $ cc -Wall -Wextra sscanf.c
    $ ./a.out
    str_a: ret= 1, val= -1, errno= ERANGE
    str_s: ret= 1, val= 7, errno= 0
    str_d: ret= 1, val= -6, errno= 0
    str_f: ret= 1, val= -2147483618, errno= 0
    str_g: ret= 1, val= 2147483638, errno= 0

The suggested change would be to act as if

    strtoi(str, NULL, 0, INT_MIN, INT_MAX, &err);

would have been called.  Does that make sense to you?

Also, I was going to ask for strtoi(3bsd) in glibc, since strtol(3)
isn't easy to use portably (since POSIX allows EINVAL on no conversion,
how to differentiate strtoi(3bsd)'s ECANCELED from EINVAL in
strtol(3)?).

Thanks,
Alex

-- 
<https://www.alejandro-colomar.es/>
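
For illustration only, the range-checked semantics suggested above can be
approximated on top of strtol(3).  This is a rough sketch under stated
assumptions, not glibc code and not the strtoi(3bsd) implementation:
parse_int() is a hypothetical helper name, and its status codes are only
loosely modelled on what strtoi(3bsd) is documented to report (ECANCELED,
ERANGE, ENOTSUP).  It tests the end pointer rather than errno to detect
"no conversion", which sidesteps the optional EINVAL from strtol(3).

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * parse_int() is a hypothetical helper, not a glibc or libbsd function.
 * It parses an int from s with strtol(3) and reports a status loosely
 * modelled on strtoi(3bsd):
 *   0          success
 *   ECANCELED  no digits were converted
 *   ERANGE     value outside [INT_MIN, INT_MAX]; the result is clamped
 *   ENOTSUP    trailing characters follow the number
 * errno is saved and restored, so the caller's errno is left untouched.
 */
static int
parse_int(const char *s, int *status)
{
	char  *end;
	long  val;
	int   saved_errno;

	saved_errno = errno;
	errno = 0;
	val = strtol(s, &end, 0);

	if (end == s)
		*status = ECANCELED;  /* checked via end, not via EINVAL */
	else if (errno == ERANGE || val < INT_MIN || val > INT_MAX)
		*status = ERANGE;
	else if (*end != '\0')
		*status = ENOTSUP;
	else
		*status = 0;

	errno = saved_errno;

	/* Clamp, so a wider long never wraps around when narrowed to int. */
	if (val < INT_MIN)
		return INT_MIN;
	if (val > INT_MAX)
		return INT_MAX;
	return (int) val;
}
```

With this sketch, "2147483678" (2^31 + 30) would yield INT_MAX with a
status of ERANGE instead of the wrapped value -2147483618 shown in the
test output, and the caller's errno would not be modified.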