[CC += GCC] // pun not intended :P Hi Zack, On 1/20/23 19:04, Zack Weinberg wrote:
On Fri, Jan 20, 2023, at 8:40 AM, Alejandro Colomar wrote:The historical design of `sockaddr_storage` makes it impossible to use without breaking strict aliasing rules. Not only this type is unusable, but even the use of other `sockaddr_*` structures directly (when the programmer only cares about a single address family) is also incorrect, since at some point the structure will be accessed as a `sockaddr`, and that breaks strict aliasing rules too. So, the only way for a programmer to not invoke Undefined Behavior is to declare a union that includes `sockaddr` and any `sockaddr_*` structures that are of interest, which allows later accessing as either the correct structure or plain `sockaddr` for the sa_family....struct new_sockaddr_storage nss; // ... (initialize oss and nss) inet_sockaddr2str(&nss.sa); // correct (and has no casts)I think we need to move slowly here and be _sure_ that any proposed change does in fact reduce the amount of UB.
Sure, I'm just sending the patch to polish the idea around some concrete code. While I checked that it compiles, I didn't add any tests about it or anything, to see that it's usable (and Joseph's email showed me that it's far from being finished). I expect that this'll take some time.
This construct, in particular, might not actually be correct in practice: see https://godbolt.org/z/rn51cracn for a case where, if I'm reading it right, the compiler assumes that a write through a `struct fancy *` cannot alter the values accessible through a `struct simple *` even though both pointers point into the same union. (Test case provided by <https://stackoverflow.com/users/363751/supercat>;
I @supercat around? Maybe you want to open a question on SO (or Codidact) and we can discuss it there; it would be interesting. :)
About the program, I have doubts. It's more or less what I asked on Codidact yesterday[1]. I can't find anything in the standard to support GCC's behavior, and am inclined to think that it's a compiler bug. @Lundin's answer[2] seems reasonable. But still, my understanding until yesterday seemed to be in line with the compiler behavior that you showed: the compiler sees a pointer to a different type, and assumes things.
I believe it's fine according to the common initial sequence rule in C11::6.5.2.3/6 <https://port70.net/%7Ensz/c/c11/n1570.html#6.5.2.3p6> in combination with the pointer conversion rule in C11::6.3.2.3/7 <https://port70.net/%7Ensz/c/c11/n1570.html#6.3.2.3p7>.
The test you showed in godbolt shows two behaviors: sometimes it prints 1/3 (this is correct; it happens for high values of -O) and sometimes it prints 3/3 (this is invalid; it happens for low values of -O). BTW, I'm a bit surprised that GCC optimizes out the local in -O0 or even -Og.
What I guess that GCC is doing in the invalid case is that in main() it sees that myThing is modified in vtest(), and then the structure is only used through a pointer to a different member, which according to 6.5.2.3/6 is only allowed for reading but not writing, so it assumes that loading the structure again at printf() will be fine, so it optimizes out the local. However, GCC didn't take into account that the pointer can later be casted back to the right member, and that way it's allowed to modify it, according to 6.3.2.3/7.
I had dropped GCC from the CC in the email to not spam them too much, but since this concerns them, I'll keep them in the loop from now on.
[1]: <https://software.codidact.com/posts/287748> [2]: <https://software.codidact.com/posts/287748/287750#answer-287750>
I don't know any other identifier for them.) zw
Cheers, Alex -- <http://www.alejandro-colomar.es/>
Attachment:
OpenPGP_signature
Description: OpenPGP digital signature