Hi JeanHeyd,
Subject: Re: [PATCH] Various pages: SYNOPSIS: Use VLA syntax in function parameters
Date: Fri, 2 Sep 2022 16:56:00 -0400
From: JeanHeyd Meneide <wg14@xxxxxxxxxx>
To: Alejandro Colomar <alx.manpages@xxxxxxxxx>
CC: Ingo Schwarze <schwarze@xxxxxxx>, linux-man@xxxxxxxxxxxxxxx

Hi Alejandro and Ingo,

Just chiming in from a Standards perspective, here. We discussed, briefly, a way to allow Variable-Length function parameter declarations like the ones shown in this thread (e.g., char *getcwd(char buf[size], size_t size);).

In GCC, there is a GNU extension that allows explicitly forward-declaring the parameter. Using the above example, it would look like so:
I added the GCC list to the thread, so that they can intervene if they consider it necessary.
char *getcwd(size_t size; char buf[size], size_t size);
I read about that, although I don't like it very much, and never used it.
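For anyone who wants to try the extension, here is a minimal sketch of a definition that uses it. The function fill_dots() is a hypothetical helper made up for illustration (it is not a libc function), and the preprocessor guard assumes that only GCC proper accepts the forward parameter declaration, so other compilers get the plain pointer form:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, for illustration only: fills buf with
 * size - 1 dots and a terminating NUL.  Precondition: size > 0. */
#if defined(__GNUC__) && !defined(__clang__)
/* GNU extension: 'size' is forward-declared before the ';', so it can
 * be used in the bound of 'buf' although it is really declared later. */
char *fill_dots(size_t size; char buf[size], size_t size)
#else
/* Assumed fallback for compilers without the extension. */
char *fill_dots(char *buf, size_t size)
#endif
{
    assert(size > 0);
    memset(buf, '.', size - 1);
    buf[size - 1] = '\0';
    return buf;
}
```

Callers are unaffected either way: fill_dots(buf, sizeof(buf)) is written the same under both branches, which is the point about the syntax being documentation more than anything else.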
(Live Example [1])

(Note the `;` after the first "size" declaration). This was brought before the Committee to vote on for C23 in the form of N2780 [2], around the January 2022 timeframe. The paper did not pass, and it was seen as a "failed extension". After that vote failed, we talked about other ways of allowing this, and whether there was some appetite to allow "forward parsing" for this sort of case. That is, could we simply allow:

char *getcwd(char buf[size], size_t size);

to work as expected. The vote for this did not gain full consensus either, but there were a lot of abstentions [3]. While I personally voted in favor of allowing such for C, there was distinct worry that this would produce issues for weaker C implementations that did not want to commit to delayed parsing or forward parsing of the entirety of the argument list before resolving types. There were enough abstentions during voting that a working implementation with a writeup of complexity could sway the Committee one way or the other.
I like that this got less hate than the GNU extension. It's nicer to my eyes.
This is not to dissuade Alejandro's position, or to bolster Ingo's point; I'm mostly just reporting the Committee's response here. This is an unsolved problem for the Committee, and also a larger holdover from the removal of K&R declarations from C23, which COULD solve this problem:

// decl
char *getcwd();

// impl
char* getcwd(buf, size)
char buf[size];
size_t size;
{
    /* impl here */
}
I won't miss them ;)

My regex-based parser[1] that finds declarations and definitions in C code bases goes nuts with K&R functions. They are dead for good :)
[1]: <http://www.alejandro-colomar.es/src/alx/alx/grepc.git/>
There is room for innovation here, or perhaps bolstering of the GCC original extension. As it stands right now, compilers only very recently started taking Variably-Modified Type parameters and Static Extent parameters seriously after carefully separating them out of Variable-Length Arrays, warning where they can when static or other array parameters do not match buffer lengths and so on.

Not just to the folks in this thread, but to the broader community for anyone who is paying attention: WG14 would actively like to solve this problem. If someone can:

- prove out a way to do delayed parsing that is not implementation-costly,
- revive the considered-dead GCC extension, or
- provide a 3rd or 4th way to support the goals,

I am certain WG14 would look favorably upon such a thing eventually, brought before the Committee for inclusion in C2y/C3a.

Whether or not you feel like the manpages are the best place to start that, I'll leave up to you!
I'll try to defend the reasons to start this in the man-pages.

This feature is mostly for documentation purposes, not being meaningful for code at all (for some meaning of meaningful), since it won't change the function definition in any way, nor the calls to it. At least not by itself; static analysis may get some benefits, though.
Also, new code can be designed from the beginning so that sizes go before their corresponding arrays, so that new code won't typically be affected by the lack of this feature in the language.
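As a minimal sketch of that design, here is a function that takes the size before the array, so the bound is already in scope and no extension is needed. The name copy_name() and its truncating behavior are hypothetical, chosen only for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not a libc function): copies name into buf,
 * truncating if necessary, and returns the number of bytes copied
 * (not counting the terminating NUL).  'size' is declared before
 * 'buf', so 'char buf[size]' is plain C99 syntax. */
size_t copy_name(size_t size, char buf[size], const char *name)
{
    size_t len;

    assert(name != NULL);
    if (size == 0)
        return 0;

    len = strlen(name);
    if (len >= size)
        len = size - 1;  /* truncate, leaving room for the NUL */
    memcpy(buf, name, len);
    buf[len] = '\0';
    return len;
}
```

With an 8-byte buffer, copy_name(sizeof(buf), buf, "linux-man") stores "linux-m" and returns 7.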
This leaves us with legacy code, especially libc, which just works, and has no urgent need to change its prototypes in this regard (it could, to improve static analysis, but that's not what we'd call urgent).
And since most people don't go around reading libc headers searching for function declarations (especially since there are manual pages that show them nicely), it's not like the documentation of the code depends on how the function is _actually_ declared in code (that's why I also defended documenting restrict even if glibc wouldn't have cared to declare it), but it depends basically on what the manual pages say about the function. If the manual pages say a function gets 'restrict' params, it means it gets 'restrict' params, no matter what the code says, and if it doesn't, the function accepts overlapping pointers, at least for most of the public (modulo manual page bugs, that is).
So this extension could very well be added by the manual pages, as a form of documentation, and then maybe picked up by compilers that have enough resources to implement it.
Considering that this feature is mostly about documentation (and a bit of static analysis too), the documentation should be something appealing to the reader.
Let's take an example:

int getnameinfo(const struct sockaddr *restrict addr,
                socklen_t addrlen,
                char *restrict host, socklen_t hostlen,
                char *restrict serv, socklen_t servlen,
                int flags);

and some transformations:

int getnameinfo(const struct sockaddr *restrict addr,
                socklen_t addrlen,
                char host[restrict hostlen], socklen_t hostlen,
                char serv[restrict servlen], socklen_t servlen,
                int flags);

int getnameinfo(socklen_t hostlen;
                socklen_t servlen;
                const struct sockaddr *restrict addr,
                socklen_t addrlen,
                char host[restrict hostlen], socklen_t hostlen,
                char serv[restrict servlen], socklen_t servlen,
                int flags);

(I'm not sure if I used correct GNU syntax, since I never used that extension myself.)
The first transformation above is non-ambiguous, as concise as possible, and its only issue is that it might complicate the implementation a bit too much. I don't think forward-using a parameter's size would be too much of a parsing problem for human readers.
The second one is unnecessarily long and verbose, and semicolons are not very distinguishable from commas, for human readers, which may be very confusing.
int foo(int a; int b[a], int a);
int foo(int a, int b[a], int o);

Those two are very different to the compiler, and yet very similar to the human eye. I don't like it. The fact that it allows for simpler compilers isn't enough to overcome the readability issues.
I think I'd prefer having the forward-using syntax as a non-standard extension --or a standard but optional language feature-- to avoid forcing small compilers to implement it, rather than having the GNU extension standardized in all compilers.
Having this extension in even a single compiler would make it more appealing to manual pages, which could use the syntax more freely without fear of confusing readers, even if the standard wouldn't accept it.
Let's see if GCC likes the feature and helps me attempt to use it a little bit! :-)
Cheers,

Alex

-- 
Alejandro Colomar
<http://www.alejandro-colomar.es/>