Hi Martin,

On 9/3/22 14:47, Martin Uecker wrote:
[...]
GCC will warn if the bound is specified inconsistently between declarations and also emit warnings if it can see that a buffer which is passed is too small: https://godbolt.org/z/PsjPG1nv7
That's very good news!

BTW, it's nice to see that GCC doesn't need 'static' for array parameters. I never understood what the static keyword adds there. There's no way one can specify an array size and mean anything other than requiring that, for a non-null pointer, the array should have at least that size.
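To make the 'static' question concrete, here is a small sketch of the two spellings side by side. The function names are made up for illustration. As far as ISO C is concerned, only the 'static' form states a requirement (non-null pointer, at least that many elements); the plain bound is treated as documentation, which is why it is nice that GCC appears to use it for diagnostics anyway.

```c
#include <assert.h>

/* Hypothetical helper: with 'static', ISO C lets the callee assume a
   non-null pointer to at least 4 elements. */
int sum4_static(const int a[static 4])
{
    return a[0] + a[1] + a[2] + a[3];
}

/* Hypothetical helper: same intent, plain bound. To ISO C this is just
   'const int *a'; the 4 only documents what the author expects. */
int sum4_plain(const int a[4])
{
    assert(a != 0);
    return a[0] + a[1] + a[2] + a[3];
}
```

Both are called the same way, e.g. `sum4_plain(v)` for some `int v[4]`, and a caller cannot tell them apart by type.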
BTW: If you declare pointers to arrays (not first elements) you can get run-time bounds checking with UBSan: https://godbolt.org/z/TvMo89WfP
Couldn't that be caught at compile time? n is always out of bounds for such an array, since the last element is n-1.
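A minimal sketch of the pointer-to-array idea, not the code behind the godbolt link (which is not reproduced here); the function names are hypothetical. Because the parameter points to the whole array, the bound n is part of the pointed-to type, which is presumably what gives a sanitizer something to check against.

```c
#include <assert.h>
#include <stddef.h>

/* 'a' points to an entire array of n ints, not to its first element. */
int last_element(int n, int (*a)[n])
{
    assert(n > 0);
    /* Valid indices are 0 .. n-1; (*a)[n] would be one past the end. */
    return (*a)[n - 1];
}

/* The element count can be recovered from the type itself. */
size_t element_count(int n, int (*a)[n])
{
    (void)n;  /* only used in the declarator above */
    return sizeof(*a) / sizeof((*a)[0]);
}
```

A caller passes the address of the array, e.g. `last_element(4, &v)` for `int v[4]`.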
Also, new code can be designed from the beginning so that sizes go before their corresponding arrays, so that new code won't typically be affected by the lack of this feature in the language. This leaves us with legacy code, especially libc, which just works, and doesn't have any urgent need to change its prototypes in this regard (it could, to improve static analysis, but that's not what we'd call urgent).

It would be a useful step to find out-of-bounds problems in applications using libc.
Yep, it would be very useful for that. Not urgent, but yes, very useful.
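For new code, the size-before-array ordering mentioned above could look like the following sketch. The function copy_truncated() is a hypothetical helper, not an existing libc interface; it only illustrates that the bound is already in scope when the array parameter is declared, so plain standard C is enough.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src into dst, truncating so that the result
   always fits in dstlen bytes including the terminating null byte.
   Returns the number of characters stored, not counting the null byte. */
size_t copy_truncated(size_t dstlen, char dst[dstlen], const char *src)
{
    assert(dstlen > 0);

    size_t len = strlen(src);
    if (len >= dstlen)
        len = dstlen - 1;

    memcpy(dst, src, len);
    dst[len] = '\0';
    return len;
}
```

Usage would be along the lines of `copy_truncated(sizeof(buf), buf, "hello")`.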
Let's take an example:

    int getnameinfo(const struct sockaddr *restrict addr,
                    socklen_t addrlen,
                    char *restrict host, socklen_t hostlen,
                    char *restrict serv, socklen_t servlen,
                    int flags);

and some transformations:

    int getnameinfo(const struct sockaddr *restrict addr,
                    socklen_t addrlen,
                    char host[restrict hostlen], socklen_t hostlen,
                    char serv[restrict servlen], socklen_t servlen,
                    int flags);

    int getnameinfo(socklen_t hostlen;
                    socklen_t servlen;
                    const struct sockaddr *restrict addr,
                    socklen_t addrlen,
                    char host[restrict hostlen], socklen_t hostlen,
                    char serv[restrict servlen], socklen_t servlen,
                    int flags);

(I'm not sure if I used correct GNU syntax, since I never used that extension myself.)

The first transformation above is non-ambiguous, as concise as possible, and its only issue is that it might complicate the implementation a bit too much. I don't think forward-using a parameter's size would be too much of a parsing problem for human readers.

I personally find the second form not terrible. Being able to read code left-to-right, top-down is helpful in more complicated examples.

The second one is unnecessarily long and verbose, and semicolons are not very distinguishable from commas, for human readers, which may be very confusing.

    int foo(int a; int b[a], int a);
    int foo(int a, int b[a], int o);

Those two are very different to the compiler, and yet very similar to the human eye. I don't like it.
The fact that it allows for simpler compilers isn't enough to overcome the readability issues.

This is true, I would probably use it with a comma and/or syntax highlighting.

I think I'd prefer having the forward-using syntax as a non-standard extension --or a standard but optional language feature-- to avoid forcing small compilers to implement it, rather than having the GNU extension standardized in all compilers.

The problems with the second form are:

- it is not 100% backwards compatible (which may be OK though), as the semantics of the following code change:

    int n;
    int foo(int a[n], int n); // refers to different n!

  Code written for new compilers could then be misunderstood by old compilers when a variable named 'n' is in scope.
Hmmm, this one is serious. I can't seem to solve it with that syntax.
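To see how today's compilers resolve that n, here is a sketch using a pointer to an array so that the bound is observable through sizeof. The function name and the value 3 are made up for illustration. In current C, the n in the declarator is looked up before the parameter n is declared, so it binds to the file-scope variable.

```c
#include <assert.h>
#include <stddef.h>

int n = 3;  /* file-scope variable that happens to be called n */

/* The n in 'int (*a)[n]' is the file-scope n above, evaluated on entry.
   The parameter n only comes into scope after its own declarator. */
size_t bytes_per_row(int (*a)[n], int n)
{
    assert(n > 0);      /* inside the body, n is the parameter */
    return sizeof(*a);  /* 3 * sizeof(int), whatever the parameter is */
}
```

So a call such as `bytes_per_row(&row, 100)` with `int row[3]` still reports the size of three ints; a compiler implementing forward use of parameters would read the same declaration as referring to the parameter.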
- it would generally be fundamentally new to C to have backwards references, and parsers might need to be changed to allow this

- a compiler or tool then also has to deal with ugly corner cases such as mutual references:

    int foo(int (*a)[sizeof(*b)], int (*b)[sizeof(*a)]);

We could consider new syntax such as

    int foo(char buf[.n], int n);

Personally, I would prefer the conceptual simplicity of forward declarations and the fact that these exist already in GCC over any alternative. I would also not mind new syntax, but then one has to define the rules more precisely to avoid the aforementioned problems.
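For reference, the forward declarations that already exist in GCC can be sketched as below. This relies on the GNU parameter forward declaration extension, so it is expected to build with GCC only (not strict ISO C, and other compilers may reject it); the function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* GNU extension: 'int n;' before the real parameter list announces that a
   parameter n follows later, so 'buf' can use it while still being
   declared before its size. */
size_t buffer_bytes(int n; char (*buf)[n], int n)
{
    assert(n > 0);
    return sizeof(*buf);  /* n bytes, taken from the parameter n */
}
```

A caller writes `buffer_bytes(&b, 16)` for `char b[16]`; the forward declaration itself does not add an argument.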
What about taking something from K&R functions for this?:

    int foo(q; w; int a[q], int q, int s[w], int w);

By not specifying the types, the syntax is again short. This is left-to-right, so no problems with global variables, and no need for complex parsers. Also, by not specifying types, now it's more obvious to the naked eye that there's a difference:

    int foo(a; int b[a], int a);
    int foo(int a, int b[a], int o);

What do you think about this syntax?

Thanks,

Alex

--
Alejandro Colomar
<http://www.alejandro-colomar.es/>