Users reported audio breakage for 32-bit pulse clients connected to a 64-bit server over memfds. Investigating the issue further, the problem is twofold: 1. iochannel's file-descriptor passing code is liberal in what it issues: produced ancillary data object's "data" section exceeds length field. How such an extra space is handled is a grey area in the POSIX.1g spec, the IETF RFC #2292 "Advanced Sockets API for IPv6" memo, and the cmsg(3) manpage. 2. A 64-bit kernel handling of such extra space differs by whether the app is 64-bit or 32-bit. For 64-bit apps, the kernel smartly ducks the issue. For 32-bit apps, an -EINVAL is directly returned; that's due to a kernel CMSG header traversal bug in the networking stack "32-bit sockets emulation layer". Compare Linux Kernel's socket.h cmsg_nxthdr() code and the 32-bit emulation layer version of it at net/compat.c cmsg_compat_nxthdr() for further info. Notice how the former graciously ignores incomplete CMSGs while the latter _directly_ complains about them -- as of kernel version 4.9-rc5. (A kernel patch is to be submitted) Details: iochannel typically uses sendmsg() for passing FDs & credentials. >From RFC 2292, sendmsg() control data is just a heterogeneous array of embedded ancillary objects that can differ in length. Linguistically, a "control message" is an ancillary data object. For example, below is a sendmsg() "msg_control" containing two ancillary objects: |<---------------------- msg_controllen---------------------->| | | |<--- ancillary data object -->|<----- ancillary data object->| |<------- CMSG_SPACE() ------->|<------- CMSG_SPACE() ------->| | | | |<-------- cmsg_len ------->| |<-------- cmsg_len ------->| | |<------- CMSG_LEN() ------>| |<------- CMSG_LEN() ------>| | | | | | | +-----+-----+-----+--+------+--+-----+-----+-----+--+------+--+ |cmsg_|cmsg_|cmsg_|XX|cmsg_ |XX|cmsg_|cmsg_|cmsg_|XX|cmsg_ |XX| |len |level|type |XX|data[]|XX|len |level|type |XX|data[]|XX| +-----+-----+-----+--+------+--+-----+-----+-----+--+----+-+--+ ^^^^^^^ Ancil Object #1 ^^^^^^^ Ancil Object #2 (control message) (control message) ^ | +--- sendmsg() "msg_control" points here Problem is, while passing FDs, iochannel's code try to avoid variable-length arrays by creating a single cmsg object that can fit as much FDs as possible: union { struct cmsghdr hdr; uint8_t data[CMSG_SPACE(sizeof(int) * MAX_ANCIL_DATA_FDS)]; } cmsg; ^^^^^^^^^^^^^^^^^^ Most of the time though the number of FDs to be passed is less than the maximum above, thus "cmsg_len" is set to the _actual_ FD array size: cmsg.hdr.cmsg_len = CMSG_LEN(sizeof(int) * nfd); ^^^ This inconsistency tricks the kernel into thinking that we have 2 ancillay data objects instead of one! First cmsg is valid as intended, but the second is instantly _corrupt_ since it has a cmsg_len size of 0 -- thus failing kernel's CMSG_OK() tests. For 32-bit apps on a 32-bit kernel, and 64-bit apps over a 64-bit one, the kernel's own CMSG header traversal macros just ignore the second "incomplete" cmsg. For 32-bit apps over a 64-bit kernel though, the kernel 32-bit socket emulation macros does not forgive such incompleteness and directly complains of invalid args (due to a subtle bug). Avoid this ugly problem, which can also bite us in a pure 64-bit environment if MAX_ANCIL_DATA_FDS got extended to 5 FDs, by setting "cmsg_data[]" array size to "cmsg_len". BugLink: https://bugs.freedesktop.org/show_bug.cgi?id=97769 Signed-off-by: Ahmed S. Darwish <darwish.07 at gmail.com> --- Hopefully variable-length arrays won't be problematic in esoteric build environments? src/pulsecore/iochannel.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/src/pulsecore/iochannel.c b/src/pulsecore/iochannel.c index e62750b..8ace297 100644 --- a/src/pulsecore/iochannel.c +++ b/src/pulsecore/iochannel.c @@ -346,6 +346,8 @@ ssize_t pa_iochannel_write_with_creds(pa_iochannel*io, const void*data, size_t l return r; } +/* For more details on FD passing, check the cmsg(3) manpage + * and IETF RFC #2292: "Advanced Sockets API for IPv6" */ ssize_t pa_iochannel_write_with_fds(pa_iochannel*io, const void*data, size_t l, int nfd, const int *fds) { ssize_t r; int *msgdata; @@ -353,7 +355,7 @@ ssize_t pa_iochannel_write_with_fds(pa_iochannel*io, const void*data, size_t l, struct iovec iov; union { struct cmsghdr hdr; - uint8_t data[CMSG_SPACE(sizeof(int) * MAX_ANCIL_DATA_FDS)]; + uint8_t data[CMSG_SPACE(sizeof(int) * nfd)]; } cmsg; pa_assert(io); -- 2.4.3