On Thu, Mar 30, 2023 at 07:13:11PM +0200, Alejandro Colomar wrote:
> POSIX.1 Issue 8 will fix the long-standing issue with sockaddr APIs,
> which inevitably caused UB either on user code, libc, or more likely,
> both. sockaddr_storage has been clarified to be implemented in a manner
> that aliasing it is safe (suggesting a unnamed union, or other compiler
> magic).
>
> Link: <https://www.austingroupbugs.net/view.php?id=1641>
> Reported-by: Bastien Roucariès <rouca@xxxxxxxxxx>
> Reported-by: Alejandro Colomar <alx@xxxxxxxxxx>
> Cc: glibc <libc-alpha@xxxxxxxxxxxxxx>
> Cc: GCC <gcc@xxxxxxxxxxx>
> Cc: Eric Blake <eblake@xxxxxxxxxx>
> Cc: Stefan Puiu <stefan.puiu@xxxxxxxxx>
> Cc: Igor Sysoev <igor@xxxxxxxxx>
> Cc: Rich Felker <dalias@xxxxxxxx>
> Cc: Andrew Clayton <andrew@xxxxxxxxxxxxxxxxxx>
> Cc: Richard Biener <richard.guenther@xxxxxxxxx>
> Cc: Zack Weinberg <zack@xxxxxxxxxxxx>
> Cc: Florian Weimer <fweimer@xxxxxxxxxx>
> Cc: Joseph Myers <joseph@xxxxxxxxxxxxxxxx>
> Cc: Jakub Jelinek <jakub@xxxxxxxxxx>
> Cc: Sam James <sam@xxxxxxxxxx>
> Signed-off-by: Alejandro Colomar <alx@xxxxxxxxxx>
> ---
>
> Hi all,
>
> This is my proposal for documenting the POSIX decission of fixing the
> definition of sockaddr_storage. Bastien, I believe you had something
> similar in mind; please review. Eric, thanks again for the fix! Could
> you please also have a look at this?
>
> Cheers,
>
> Alex
>
> man3type/sockaddr.3type | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/man3type/sockaddr.3type b/man3type/sockaddr.3type
> index 32c3c5bd0..d1db87d5d 100644
> --- a/man3type/sockaddr.3type
> +++ b/man3type/sockaddr.3type
> @@ -23,6 +23,14 @@ .SH SYNOPSIS
> .PP
> .B struct sockaddr_storage {
> .BR " sa_family_t ss_family;" " /* Address family */"
> +.PP
> +.RS 4
> +/* This structure is not really implemented this way.
It may be
> +\& implemented with an unnamed union or some compiler magic to
> +\& avoid breaking aliasing rules when accessed as any other of the
> +\& sockaddr_* structures documented in this page. See CAVEATS.
> +\& */

Do we want similar comments in struct sockaddr and/or sockaddr_XX?

> +.RE
> .B };
> .PP
> .BR typedef " /* ... */ " socklen_t;
> @@ -122,6 +130,20 @@ .SH NOTES
> .I <netinet/in.h>
> and
> .IR <sys/un.h> .
> +.SH CAVEATS
> +To avoid breaking aliasing rules,
> +programs that use functions that receive pointers to
> +.I sockaddr
> +structures should declare objects of type
> +.IR sockaddr_storage ,
> +which is defined in a way that it
> +can be accessed as any of the different structures defined in this page.
> +Failure to do so may result in Undefined Behavior.

Existing POSIX already requires sockaddr_storage to be suitably sized
and aligned to overlay with all other sockaddr* types. What the
recent POSIX bug change does is add wording to emphasize that casts in
any of the 6 directions:

sockaddr* <-> sockaddr_XX*
sockaddr_storage* <-> sockaddr*
sockaddr_storage* <-> sockaddr_XX*

must allow the sa_family/ss_family/sa_family_t member to overlay
without triggering undefined behavior due to bad aliasing, at which
point, access to that member lets you deduce what other object type
you really have.

But you are also correct that merely casting a pointer to another
larger struct, which by itself doesn't trigger aliasing problems, and
then dereferencing beyond the bounds of the original object is not
intended to be portable. The aliasing diagnostics are suppressed
because of the requirements on the first member, so the user must now
be careful that their access of the remaining members is safe, even
though the compiler is no longer helping them because of the magic
that suppressed the aliasing detection.
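
To make the two points above concrete, here is a minimal sketch, not
taken from the patch. The helper names fill_ipv4_loopback() and
sockaddr_port() are hypothetical. The object is declared as
sockaddr_storage, only the family member is read through the generic
sockaddr pointer, and the length is checked before anything past that
member is touched. Copying into the concrete type with memcpy() is a
conservative choice of this sketch, assumed here so that it does not
depend on how a particular libc defines the types; the thread does not
require it.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper (not part of the patch): store an IPv4 loopback
 * address in a sockaddr_storage object and return the length in use. */
static socklen_t
fill_ipv4_loopback(struct sockaddr_storage *ss, unsigned short port)
{
    struct sockaddr_in sin;

    memset(&sin, 0, sizeof(sin));
    sin.sin_family = AF_INET;
    sin.sin_port = htons(port);
    sin.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    /* sockaddr_storage is large enough for any sockaddr_XX. */
    memset(ss, 0, sizeof(*ss));
    memcpy(ss, &sin, sizeof(sin));
    return sizeof(sin);
}

/* Hypothetical helper: return the port in host byte order, or -1 if the
 * family is not handled here or len is too short for that family. */
static int
sockaddr_port(const struct sockaddr *sa, socklen_t len)
{
    struct sockaddr_in sin;
    struct sockaddr_in6 sin6;

    if (sa == NULL
        || len < offsetof(struct sockaddr, sa_family) + sizeof(sa->sa_family))
        return -1;

    /* Only the family member is read through the generic pointer. */
    switch (sa->sa_family) {
    case AF_INET:
        if (len < sizeof(sin))
            return -1;  /* fewer valid bytes than a sockaddr_in needs */
        memcpy(&sin, sa, sizeof(sin));
        return ntohs(sin.sin_port);
    case AF_INET6:
        if (len < sizeof(sin6))
            return -1;  /* fewer valid bytes than a sockaddr_in6 needs */
        memcpy(&sin6, sa, sizeof(sin6));
        return ntohs(sin6.sin6_port);
    default:
        return -1;      /* family not handled by this sketch */
    }
}
```

A caller would pass (const struct sockaddr *)&ss together with the
length returned by fill_ipv4_loopback(), in the same way the existing
socket functions take a sockaddr pointer and a socklen_t.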

I agree with your warning that code that can handle generic socket
types should use sockaddr_storage (and not sockaddr) as the original
object (the one object that the standard requires to be suitably sized
and aligned to overlay with the entirety of all other sockaddr types,
rather than just the sa_family_t first member), although we may want
to be more precise that code using a specific protocol type can
directly use the proper sockaddr_XX type rather than having to use an
intermediate sockaddr_storage. I'm not sure if there are better ways
to word that paragraph to convey the intended sentiment.

> +.PP
> +New functions should be written to accept pointers to
> +.I sockaddr_storage
> +instead of the traditional
> +.IR sockaddr .

I'm less certain about this one. The POSIX wording specifically chose
to keep existing API/ABI of sockaddr* in all the standardized
functions unchanged, as it would be too invasive to existing code to
change the signatures now. The burden is on the system headers to
define types so that the necessary casts (present in lots of existing
code because sockaddr* has a bit more type-safety than void*) do not
of themselves cause aliasing issues, and therefore avoid undefined
behavior provided subsequent code accessing through the pointers is
not accessing beyond the bounds of the real object. The likelihood of
POSIX adding new socket APIs taking sockaddr_storage* just to enforce
non-aliasing seems slim.

But then again, this advice applies to more than just functions likely
to be standardized in a future libc, so maybe this paragraph is worth
it after all.

> .SH SEE ALSO
> .BR accept (2),
> .BR bind (2),
> --
> 2.39.2
>

--
Eric Blake, Principal Software Engineer
Red Hat, Inc. +1-919-301-3266
Virtualization: qemu.org | libvirt.org