Fwd: [PATCH] sscanf.3: Remove term 'deprecated', and expand BUGS

(repost to mailing list, as my previous message attempt looked like
plain-text but was actually html)



> Hi Lee!

> Thanks for the report.  After seeing how much frustration it has caused,
> I propose this change.  Does it look good to you?

I don't wish to bike-shed this (as the current man-page is fine by me)
and I have no idea on the style guide used by the man-pages, but if I
was making the change I would replace the 'deprecated' on every
integer specifier with "CAVEAT: SEE BUGS". That way the inexperienced
reader is still frightened into using the function carefully. But if
that kind of thing isn't allowed then the proposed patch looks good to
me.

As a general point: A _lot_ of inexperienced users use this function
to parse user input. At the start of every semester you see an influx
of "why is my use of scanf broken?" posts on the various C and
learn-programming subreddits, as well as on Stack Overflow. I have
no idea why but it seems there's a large body of professors out there
teaching people to use scanf() instead of getc() or fgets() etc, so
I'm of the opinion that the scanf() page needs to be as scary as
possible :)

Again, I know nothing about how man pages are written, but if it were
documentation for legacy code I'd inherited I'd make sure to stress
the following somewhere on the page:

1. scanf() is intended to parse FORMATTED input, i.e. it consumes the
kind of strings produced by printf(), and NOT user input. (I'm not
100% sure if K&R had that as their rationale, but that's the way it's
designed now. Though this might confuse people into thinking they can
use their similar, but not identical, format strings between printf
and scanf!). Currently the word "format" or "formatted" barely
appears. But it's this feature that distinguishes it from the other
parsing functions.
2. Things like fgets() are much better for consuming user input, which
you can then parse with all the other functions.

Thanks,
Lee Griffiths


On Wed, 6 Dec 2023 at 14:52, Alejandro Colomar <alx@xxxxxxxxxx> wrote:
>
> Several programmers have been confused about this use of 'deprecated'.
>
> Also, maximum field width can be used with these fields to mitigate the
> problem.  Still, it's only a mitigation, since it limits the number of
> characters read, but that means an input of LONG_MAX+1 --which takes up
> the same number of characters as LONG_MAX-- would still cause UB; or
> one can limit that to well below the limit of UB, but then you
> artificially invalidate valid input.  No good way to avoid UB with
> sscanf(3), but it's not necessarily bad with trusted input (and
> strtol(3) isn't the panacea either; strtoi(3) is good, though, but not
> standard).
>
> Try to be more convincing in BUGS instead.
>
> Link: <https://stackoverflow.com/questions/77601832/man-sscanf-d-is-deprecated-in-c-or-glibc/>
> Cc: Lee Griffiths <poddster@xxxxxxxxx>
> Cc: Zack Weinberg <zack@xxxxxxxxxxxx>
> Signed-off-by: Alejandro Colomar <alx@xxxxxxxxxx>
> ---
>
> Hi Lee!
>
> Thanks for the report.  After seeing how much frustration it has caused,
> I propose this change.  Does it look good to you?
>
> Thanks,
> Alex
>
>  man3/sscanf.3 | 15 ++-------------
>  1 file changed, 2 insertions(+), 13 deletions(-)
>
> diff --git a/man3/sscanf.3 b/man3/sscanf.3
> index 2211cab7d..4c0bdc318 100644
> --- a/man3/sscanf.3
> +++ b/man3/sscanf.3
> @@ -359,7 +359,6 @@ .SS Conversions
>  and assignment does not occur.
>  .TP
>  .B d
> -.IR Deprecated .
>  Matches an optionally signed decimal integer;
>  the next pointer must be a pointer to
>  .IR int .
> @@ -374,7 +373,6 @@ .SS Conversions
>  .\" is silently ignored, causing old programs to fail mysteriously.)
>  .TP
>  .B i
> -.IR Deprecated .
>  Matches an optionally signed integer; the next pointer must be a pointer to
>  .IR int .
>  The integer is read in base 16 if it begins with
> @@ -387,18 +385,15 @@ .SS Conversions
>  Only characters that correspond to the base are used.
>  .TP
>  .B o
> -.IR Deprecated .
>  Matches an unsigned octal integer; the next pointer must be a pointer to
>  .IR "unsigned int" .
>  .TP
>  .B u
> -.IR Deprecated .
>  Matches an unsigned decimal integer; the next pointer must be a
>  pointer to
>  .IR "unsigned int" .
>  .TP
>  .B x
> -.IR Deprecated .
>  Matches an unsigned hexadecimal integer
>  (that may optionally begin with a prefix of
>  .I 0x
> @@ -409,33 +404,27 @@ .SS Conversions
>  .IR "unsigned int" .
>  .TP
>  .B X
> -.IR Deprecated .
>  Equivalent to
>  .BR x .
>  .TP
>  .B f
> -.IR Deprecated .
>  Matches an optionally signed floating-point number; the next pointer must
>  be a pointer to
>  .IR float .
>  .TP
>  .B e
> -.IR Deprecated .
>  Equivalent to
>  .BR f .
>  .TP
>  .B g
> -.IR Deprecated .
>  Equivalent to
>  .BR f .
>  .TP
>  .B E
> -.IR Deprecated .
>  Equivalent to
>  .BR f .
>  .TP
>  .B a
> -.IR Deprecated .
>  (C99) Equivalent to
>  .BR f .
>  .TP
> @@ -661,8 +650,8 @@ .SS Numeric conversion specifiers
>  programs should use functions such as
>  .BR strtol (3)
>  to parse numeric input.
> -This manual page deprecates use of the numeric conversion specifiers
> -until they are fixed by ISO C.
> +Alternatively,
> +mitigate it by specifying a maximum field width.
>  .SS Nonstandard modifiers
>  These functions are fully C99 conformant, but provide the
>  additional modifiers
> --
> 2.42.0
>
