Re: [PATCH v2] recv.2: Document MSG_CMSG_CLOEXEC as returned in msg_flags

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi Matthew,

On 2023-07-18 08:00, Matthew House wrote:
> On Mon, Jul 17, 2023 at 7:10 PM Alejandro Colomar <alx@xxxxxxxxxx> wrote:
>> Hi Matthew,
>>
>> I don't understand what's the purpose of this.  The kernel sets a bit
>> just to report to the caller that it set a bit?  No other purpose?
>> It feels very weird.  Of course, the caller already has that info,
>> doesn't it?
> 
> The main reason I posted this patch was because I was confused by the
> flag's presence in the msg_flags when I was looking at some strace logs, so
> I figured that it would be a good idea to document it.

Makes sense.

> As for the original
> purpose of the behavior, it's not really clear, and it may well have been
> an implementation artifact that got enshrined in the user space ABI. (Even
> io_uring is careful to replicate this behavior!)

This is what worries me.  I've CCd a bunch of people to see if they can
bring some light.

> 
> This behavior began when the MSG_CMSG_CLOEXEC flag was first added in Linux
> 2.6.23, with Ulrich Drepper's commit 4a19542e5f69 ("O_CLOEXEC for
> SCM_RIGHTS"). Per the commit message, the flag was designed to be
> "passe[d]... just like the existing MSG_CMSG_COMPAT flag". Since it was
> added to the msg_flags at the start of sys_recvmsg(), the
> scm_detach_fds[_compat]() functions in net/core/scm.c and net/compat.c
> could read the flag off of msg->msg_flags without having to thread the
> recvmsg() flags through.
> 
> This was indeed similar to the behavior of MSG_CMSG_COMPAT. That flag was
> added in Linux 2.5.65, with commit 3225fc8a85f4 ("[NET]: Simplify scm
> handling and sendmsg/recvmsg invocation, consolidate net compat
> syscalls."), in which put_cmsg() and scm_detach_fds() in net/core/scm.c
> read it off of msg->msg_flags. (It wouldn't actually be set in msg_flags
> until Linux 2.5.67, with commit 7e8d06bc1d90, "[COMPAT]: Fix
> MSG_CMSG_COMPAT flag passing, kill cmsg_compat_recvmsg_fixup." Both of
> these commits are from history/history.git.)
> 
> However, the MSG_CMSG_COMPAT flag has been scrubbed from the output
> msg_flags since Linux 2.6.14, with commit 37f7f421cce1 ("[NET]: Do not leak
> MSG_CMSG_COMPAT into userspace."). That's what I find so unclear:
> MSG_CMSG_CLOEXEC was added after the kernel started scrubbing
> MSG_CMSG_COMPAT from the output, but the new flag was never written to be
> similarly scrubbed.
> 
> Later, in Linux 3.10, with commits 1be374a0518a ("net: Block
> MSG_CMSG_COMPAT in send(m)msg and recv(m)msg") and a7526eb5d06b ("net:
> Unbreak compat_sys_{send,recv}msg"), MSG_CMSG_COMPAT was banned from being
> passed to the *msg() syscalls' flags from user space, with the rationale
> that they were "not intended to be part of the API". Then, in Linux 4.0, we
> reached the current status quo with commit d720d8cec563 ("net: compat:
> Ignore MSG_CMSG_COMPAT in compat_sys_{send, recv}msg"), where
> MSG_CMSG_COMPAT is allowed (and a no-op) in compat syscalls, but banned
> from non-compat syscalls.
> 
> So I agree that it's very weird that this flag gets returned to user space,
> even while the internal flag that it's modeled after doesn't. I suppose I
> could spin up a nice story, where the user-space function calling recvmsg()
> is totally separate from the function processing the returned struct
> msghdr, and the latter function would really like to know whether the fds
> in that message are close-on-exec without having to call fcntl(F_GETFD).
> But that's all just a total guess. If you want to know for sure, perhaps
> cc'ing Drepper may be worthwhile?
> 
> A cursory look hasn't shown me any existing user-space code that depends on
> this behavior. Though one library appears to be aware of this behavior,
> actively filtering MSG_CMSG_CLOEXEC out of the result flags:
> <https://github.com/dutchanddutch/node-socket-calls/blob/ca759a0da87cb112875d158f4a81b45b31f4a871/src/socket_calls.cc#L417>
> 
> Also, only somewhat relatedly, some libraries incorrectly attempt to
> request MSG_CMSG_CLOEXEC by passing it into the msg_flags field instead of
> the flags argument:
> <https://git.samba.org/samba.git/?p=samba.git;a=blob;f=lib/messaging/messages_dgm.c;hb=refs/tags/samba-4.17.9#l1272>
> <https://github.com/genodelabs/genode/blob/23.05/repos/base-linux/src/lib/base/ipc.cc#L132>
> <https://github.com/proxmox/pve-lxc-syscalld/blob/a14430f3e75c2b695332ad712164e599464177fc/src/io/seq_packet.rs#L123>

In any case, all of this mail has been very interesting,
and it would be useful to have it in the commit message of the patch.
Please send an v2 with it, and add 'Cc:' tags for all these people:

Cc: <linux-api@xxxxxxxxxxxxxxx>
Cc: <netdev@xxxxxxxxxxxxxxx>
Cc: Ulrich Drepper <drepper@xxxxxxxxx>
Cc: Ulrich Drepper <drepper@xxxxxxxxxx>
Cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>

(I don't know if <drepper@xxxxxxxxxx> still works.  Does anyone know?)

Thanks!
Alex

> 
> Thank you,
> Matthew House

-- 
<http://www.alejandro-colomar.es/>
GPG key fingerprint: A9348594CE31283A826FBDD8D57633D441E25BB5

Attachment: OpenPGP_signature
Description: OpenPGP digital signature


[Index of Archives]     [Kernel Documentation]     [Netdev]     [Linux Ethernet Bridging]     [Linux Wireless]     [Kernel Newbies]     [Security]     [Linux for Hams]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux RAID]     [Linux Admin]     [Samba]

  Powered by Linux