Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters

Alejandro Colomar <alx.manpages@xxxxxxxxx> · Sat, 27 Aug 2022 20:38:10 +0200

Hi Ingo,

On 8/27/22 15:08, Ingo Schwarze wrote:
[...]

      void *memmove(void *dest, const void *src, size_t n);

      void *memcpy(void *restrict dest, const void *restrict src,
                   size_t n);

Actually, the syntax of both is identical, only the semantics differ.

That said, you are right that using memcpy(3) when memmove(3) is
required is a famous and widespread bug.  I doubt putting "restrict"
into the SYNOPSIS will discourage careless programmers from making that
mistake though.

There will always be completely careless programmers, and I won't 
attempt to target those.  But I hope to increase the percentage of 
population that receives the message with this change.

There's a lot of good programmers out there that ignore what we'd 
probably consider basic stuff.  I've seen very successful programmers 
using pointer types to store offsets, where ptrdiff_t should be used; 
and of course the code is full of casts (including to uintptr_t) to 
silence the hundreds of warnings from the compiler yelling at that 
blasphemy (3 casts in a line of code that should be just 'q = p + offset;').

I don't think it's only carelessness (okay, there's a bit of it), but 
more that some programmers still live in the past century, and don't 
know that 'const', and more recently 'restrict' were added to the 
language.  Did they live in a cave?  So it seems.

To me, "restrict" feels like a specialized tool for people writing
compiler optimizers, not like something important enough to clutter
API documenation.

Do you regard the (abused) VLA syntax as something much worse than the
use of restrict?  Or are they more or less equivalent to you?

If your implementation really contains "restrict" in the header
file and it's standardized, putting it into the SYNOPSIS seems
acceptable to me.  Not necessary though and maybe somewhat noisy
and distracting.

I would document restrict even if glibc ommited it from their 
prototypes, because the manual pages document the behavior, and are not 
required to be uninformative when the implementation is.  memcpy(3) 
requires restrict pointers, even if the implementation doesn't advertise 
it.  The run-time behavior will produce bugs if the pointers aren't 
restrict, and so the documentation better tells that.

Should that only be limited to the DESCRIPTION?  Maybe.  I don't like 
that idea, when we have the language to express that more precisely and 
concisely.

What is the usefulness of a prototype that is as short as possible?
Okay, it's less noisy, but at the cost of giving less information to an 
interested reader.

    void *memcpy(void dest[restrict n], const void src[restrict n],
                 size_t n);

    void *memcpy(void *restrict dest, const void *restrict src,
                 size_t n);

    void *memcpy(void *dest, const void *src, size_t n);

    void *memcpy(void *dest, void *src, size_t n);

    void *memcpy(void *, void *, size_t);

Where do we stop?  Okay, I was abusing too much in the first one.  I 
didn't dare to propose that, since VLA syntax with void is, right now, a 
shooting offense.  GNU C allows pointer arithmetic on 'void *', but for 
some reason, when they are disguised as false arrays it is not allowed 
to compile.  Give it some decades, though.  I think it would be useful.

These prototypes give several levels of information, from most basic, to 
most precise:

- number of arguments.
- type of the arguments.
- small description of what they mean (through the names).
- are pointers only used for reading, or are the pointees also modified?
- can pointers point to the same storage?
- how many elements/bytes has the storage pointed to by pointers?

And now yet in those prototypes, but I'd also like to give information 
about if pointers are allowed to be null or not.  I'm still not 
convinced about how to document that.

Why draw the line of this is useful and this is noise in a specific 
point?  The description could perfectly document the const-ness of a 
parameter as well as it documents the restrict-ness or the null-ness.  I 
think that having the prototypes be concise is good, but the overall 
goal is to have the whole manual page concise.  If adding a little bit 
to the prototypes makes the description much more concise (since it 
doesn't need to document const-ness, and hopefully restrict-ness and 
null-ness or size, or at least it can be shorter about it since the 
prototype already tells you a big part), I'd go for it.

Putting something that is not in the implementation and/or not
in the standard into the SYNOPSIS seems much worse to me.

I have mixed feelings about this.

As you probably know by older threads, I don't like the standard as a 
driving force in C, so I don't like the idea that implementations and 
documentation should go behind the standard, using whatever the standard 
provides them with.  It's not completely like that, as I acknowledge 
that the standard has improved considerably the language and the 
library, and I like to use some of their improvements, or even things 
that it is yet considering (my own personal code I build it with 
-std=gnu2x, since it's just for me, so I don't care about portability, 
and want the most useful extensions, and I'm prepared to deal with 
compiler bugs); but I don't like the standard as the _only_ driving force.

Said that, I think both GCC and glibc should not be intimidated by the 
standard when developing new extensions.  Okay, that is not a wildcard 
for releasing crap; but if a feature is good, that's fine.  Now, are the 
manual pages allowed to extend the language as well?  Of course not so 
much as the compiler or libc, but a little bit wouldn't hurt.  So, I 
wouldn't take your comment too strictly.

Still, I acknowledge this suggestion of mine is far more aggressive than 
most other trivial deviations from valid code.  Maybe I should keep the 
idea floating around, and suggest it again after the new standard is 
released, so that compiler writers are less stressed about it, and can 
consider such an extension.  I'll maybe talk to GCC maintainers about it 
and see what they think.

And invalid syntax in the SYNOPSIS is even worse than that.
For example, people may attempt to use SYNOPSIS as an example
when designing their own, private function for a similar but
not identical purpose and end up writing non-portable code,
or even code that does not compile anywhere.

They may be wrong if they blame you for that, but i doubt they
will thank you.

Yeah, that's something important to consider.

They should certainly not show something imaginary
that does not match reality, and even less so using invalid syntax.

Well, not that I haven't had those thoughts, but we already use ilegal
syntax in some cases for good reasons.  See for example open(2):

         int open(const char *pathname, int flags);
         int open(const char *pathname, int flags, mode_t mode);

Of course, you can't declare two conflicting prototypes like that.

This does not seem quite as horrifying as

   char *getcwd(char buf[size], size_t size);

because at least each of the prototypes is valid.

My main concern about it would be that it is likely to make some people
think (and C++ programmers in particular :-/) that there is type
checking for the third and subsequent arguments, in which case they
will be unpleasantly surprised when accidentally writing something like

   open(pathname, flags, &some_var, mode);

and finding out later that it compiled and ran just fine, but the
resulting file wasn't quite as confidential as they hoped.

Explicitly displaying the ... to indicate the variable number of
arguments, by contrast, makes it very clear that an API is almost
certainly unusually dangerous and needs to be used with especial
diligence.

Yeah, I suggested Michael using '...' and adding in a comment:

int open(const char *pathname, int flags, ... /* mode_t mode */);

He agreed, but we were doing something else, and then I didn't ask 
again, so this change didn't make it.  If you recommend me doing it, 
I'll do.

Either way, certainly not quite as bad as invalid syntax inside
a prototype...

Cheers,

Alex

Yours,
   Ingo

--
Alejandro Colomar
<http://www.alejandro-colomar.es/>
Attachment:
OpenPGP_signature

Description: OpenPGP digital signature