Hi Branden! On 1/20/23 18:55, G. Branden Robinson wrote:
[re-ordering the mail I'm quoting] Hi Alex, I have some observations on your deprecation initiative and people's reactions to it.
Sure :)
At 2023-01-20T14:12:07+0100, Alejandro Colomar wrote:
All implementations of sscanf(3) produce Undefined Behavior (UB), AFAIK. How much you consider UB to be a real-world issue differs for each programmer, but I tend to consider all UB to be as bad as nasal demons. I'm not saying UB shouldn't exist, just that you shouldn't invoke it. And a function that is used for scanning user input is one of those places where you really want to avoid invoking UB.
If there are common idioms that result in UB, it might be worth documenting this in the man page, with a citation to the relevant clause of the standard that declares it thus.
Okay. See proposed diff below
I agree that UB is something to be avoided and I think most other programmers do too. The advantage to this approach is that if they disagree, they can take their argument to the standards body instead of litigating it with you.
:)
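To make the concern concrete, here is a minimal sketch of parsing an int with strtol(3) instead of a "%d" conversion, so that out-of-range input is reported as an error rather than left undefined. The helper name parse_int() and its return convention are assumptions made for illustration; they are not part of any libc or of the proposed patch.

```c
#include <errno.h>
#include <limits.h>
#include <stdbool.h>
#include <stdlib.h>

/*
 * Hypothetical helper (name and interface are assumptions):
 * parse a base-10 int from the whole string s.
 * Returns true and stores the value in *out on success;
 * returns false and leaves *out untouched on any error.
 */
static bool
parse_int(const char *s, int *out)
{
	char  *end;
	long  val;

	errno = 0;
	val = strtol(s, &end, 10);

	if (end == s)                        /* no digits were consumed */
		return false;
	if (*end != '\0')                    /* trailing junk after the number */
		return false;
	if (errno == ERANGE)                 /* value does not fit in a long */
		return false;
	if (val < INT_MIN || val > INT_MAX)  /* fits in a long, but not in an int */
		return false;

	*out = (int) val;
	return true;
}
```

Every failure mode here has a defined result: strtol(3) clamps to LONG_MIN or LONG_MAX and sets errno to ERANGE on overflow, and the explicit range check covers the narrowing to int.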
This is similar but different to bzero(3). bzero(3) was broken or slow in some implementations. That's probably why it was never added to ISO C, and why POSIX later removed it. The API wasn't bad, and in fact it's great, I prefer it over memset(3). The difference between bzero(3) and sscanf(3) is that bzero(3) has now been fixed,
I still don't share your preference here. The exposure of a more general interface (memset) by a general-purpose library when the implementation otherwise has no additional implementation cost is the correct choice.
While I share your interest in general-purpose over specialized, and that is the essence of Unix, I also believe that encapsulation is very necessary for writing readable code.
Your (and many others') proposal of having a project-specific macro for bzero(3) seems reasonable in the absence of a standard name for it. However, having a POSIX-blessed (until recently) name for such an interface, I'd prefer sticking to it. Otherwise, we risk having bzero(), memzero(), zerobytes(), zero(), ... which is not crazy, but hey, I prefer fewer moving parts when reading code :)
As for removing from POSIX a function just because it's not generic... I have in mind a long list of such features that are equally trivial and unnecessary (and in some cases, they hurt unlike bzero(3), IMO), yet they haven't died. For a representative, let me present our friend:
printf(3)
Oh boy, tell me it hurts your fingers writing fprintf(stdout, ) but not memset(, 0, ). At least with fprintf(3) the ordering of the parameters is obvious and I don't need to check the man page.
If a given programmer's use cases are restricted such
It's not a single given programmer. memset(3) is likely to be the most obvious case where the thin wrapper is what you want to call. There are many uses for fprintf(3), there are many uses for other such functions that have a thin wrapper in the same libc, but memset(3)? How often have you (or any code you know) used it with something other than the constant expression 0?
that one of the arguments to a general-purpose function is constant, then that is exactly the time for them to write a macro or function specific to their project to hide the complexity. If you tilt your head right, this is similar to one of the ways closures are used in other languages.
I'm fine with the function being implemented as a macro, although it would be better to have it as an inline function, so that -Os can produce smaller code if needed. In general, I don't like macros unless there's a need to avoid type conversions; for example for keeping arrays as arrays.
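As a rough sketch of what I mean (the names memzero() and ZERO_ARRAY() are hypothetical, picked only for this example, and not taken from any libc or standard), the wrapper can be an inline function, with a macro reserved for the case where the argument has to stay an array:

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical project-local wrapper around memset(3).
 * An inline function keeps normal type checking and lets the
 * compiler decide whether to inline it (e.g. under -Os).
 */
static inline void *
memzero(void *p, size_t n)
{
	return memset(p, 0, n);
}

/*
 * Hypothetical macro for whole arrays.  It has to be a macro:
 * passed to a function, the array would decay to a pointer and
 * sizeof would no longer give the array's size.
 * Caveat: if a is a pointer instead of an array, this silently
 * zeroes only sizeof(pointer) bytes.
 */
#define ZERO_ARRAY(a)  memzero((a), sizeof(a))
```

With these, a call site reads memzero(buf, len) or ZERO_ARRAY(buf), and there is no argument order to remember.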
I could change the "deprecated" statements by "see bugs",
I think you've hit upon one of the core drivers of resistance here. A problem with calling something "deprecated" is that it's often unstated _who_ is doing the deprecation. Traditionally, I think the Linux man-pages have tended only to use this term in reference to one of the standards bodies (WG14 or the Austin Group) formally employing it.
There are some pages which have single-handedly deprecated features with no standard or group doing so. I remember having seen a few pages do that, but they are all from prehistoric times, when standards didn't mean so much (or maybe there weren't such standards).
(Maybe I'm wrong, and Linux man-pages _has_ deprecated things in its own authorial voice...but if other people also don't know that, it doesn't matter, and confusion remains.)
Yes, they did. Well, confusion always happens when things change. I expect that to settle down. However, I'll try to improve my methods for deprecating broken stuff as much as I can so we can reduce the confusion.
So I suggest you adopt a new phrase, like "discouraged by Linux man-pages", to characterize the authorial voice here. Some people will ignore your advice either way, but at least they'll know who they're ignoring.[1]
I like deprecating. I want such a strong term. I'll try to clarify that it's the man-pages that do the deprecation, and not a standards body.
However, if somebody really wants to use that function, and would like to fix it, I encourage that effort. If the function is fixed, which shouldn't be that hard, I'm fine removing the messages against its usage in the manual. While that doesn't happen, I prefer strongly recommending against their usage in the manual. And dict(1) seems to say that the verb for that is "to deprecate" :)
Your dictionary is correct but social knowledge, a.k.a. tradition and folklore, impose a context on the discussion. Sometimes dumb things become tradition (like calculating factorials or Fibonacci numbers with recursive functions[2])--we don't have to acquiesce to that, but we will have to document and sometimes defend our rejection of them.
Right. memcpy(3) has a bug in the standard. However, implementations do the Right Thing (tm). If implementations did the right thing for sscanf(3), that would be enough to remove the recommendation against it. But my understanding is that the sscanf(3) implementation is not free of that problem.
This is a good opportunity to say so in these terms. "Linux man-pages discourages use of sscanf [under the conditions XXX] until implementations are corrected to avoid undefined behavior [cite URL here]."[3]
Regards, Branden
[1] In groff_man(7), I admit I have not taken my own advice, and use the term "deprecated" in a subsection heading. I have two defenses for this. (1) I reorganized the man page along those lines 5-6 years ago, when I had less practice at writing technical documentation, and (2) the man(7) macros are not formally standardized anywhere anyway. There is no "official" body with which to conflict, or with whom groff can be confused by the reader. After groff 1.23 is released (good news, I heard from Bertrand last weekend)
Nice :)
I hope to add the SunOS extension "SB" to the deprecation list now that Solaris's death seems irreversible.
[2] https://sleeplessafternoon.wordpress.com/2013/03/26/examples-of-recursion-the-good-the-bad-and-the-silly/
For the mathematically or algorithmically inclined, I also recommend "The Genuine Sieve of Eratosthenes", by Melissa O'Neill. https://www.cs.hmc.edu/~oneill/papers/Sieve-JFP.pdf
[3] groff_man(7) gives you UR/UE, so use them! >:-)
How about the following?
Cheers,
Alex
---
diff --git a/man3/sscanf.3 b/man3/sscanf.3
index 26a02521b..870c6f54b 100644
--- a/man3/sscanf.3
+++ b/man3/sscanf.3
@@ -653,6 +653,25 @@ .SS The 'a' assignment-allocation modifier
 .I gcc\~\-std=c99
 etc.).
 .SH BUGS
+.SS Numeric conversion specifiers
+Use of the numeric conversion specifiers produces Undefined Behavior
+for invalid input.
+See
+.UR https://port70.net/\:%7Ensz/\:c/\:c11/\:n1570.html\:#7.21.6.2p10
+C11 7.21.6.2/10
+.UE .
+This is a bug in the ISO C standard,
+and not an inherent design issue with the API.
+However,
+current implementations are not safe from that bug,
+so it is not recommended to use them.
+Instead,
+programs should use functions such as
+.BR strtol (3)
+to parse numeric input.
+This manual page deprecates use of the numeric conversion specifiers
+until they are fixed by ISO C.
+.SS Nonstandard modifiers
 These functions are fully C99 conformant, but provide the
 additional modifiers
 .B q
--
<http://www.alejandro-colomar.es/>