On Tue, 2023-06-27 at 14:09 +0100, David Howells wrote: > Paolo Abeni <pabeni@xxxxxxxxxx> wrote: > > > udp_sendmsg() can set the MSG_TRUNC bit in msg->msg_flags, so I guess > > that kind of actions are sort of allowed. Still, AFAICS, the kernel > > based msghdr is not copied back to the user-space, so such change > > should be almost a no-op in practice. > > > > @David: which would be the end goal for such action? > > Various places in the kernel use sock_sendmsg() - and I've added a bunch more > with the MSG_SPLICE_PAGES patches. For some of the things I've added, there's > a loop which used to call ->sendpage() and now calls sock_sendmsg(). In most > of those places, msghdr will get reset each time round the loop - but not in > all cases. > > Of particular immediate interest is net/ceph/messenger_v2.c. If you go to: > > https://lore.kernel.org/r/3111635.1687813501@xxxxxxxxxxxxxxxxxxxxxx/ > > and look at the resultant code: > > static int do_sendmsg(struct socket *sock, struct iov_iter *it) > { > struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS }; > int ret; > > msg.msg_iter = *it; > while (iov_iter_count(it)) { > ret = sock_sendmsg(sock, &msg); > if (ret <= 0) { > if (ret == -EAGAIN) > ret = 0; > return ret; > } > > iov_iter_advance(it, ret); > } > > WARN_ON(msg_data_left(&msg)); > return 1; > } > > for example. It could/would malfunction if sendmsg() is allowed to modify > msghdr - or if it doesn't update msg_iter. Likewise: > > static int do_try_sendpage(struct socket *sock, struct iov_iter *it) > { > struct msghdr msg = { .msg_flags = CEPH_MSG_FLAGS }; > struct bio_vec bv; > int ret; > > if (WARN_ON(!iov_iter_is_bvec(it))) > return -EINVAL; > > while (iov_iter_count(it)) { > /* iov_iter_iovec() for ITER_BVEC */ > bvec_set_page(&bv, it->bvec->bv_page, > min(iov_iter_count(it), > it->bvec->bv_len - it->iov_offset), > it->bvec->bv_offset + it->iov_offset); > > /* > * MSG_SPLICE_PAGES cannot properly handle pages with > * page_count == 0, we need to fall back to sendmsg if > * that's the case. > * > * Same goes for slab pages: skb_can_coalesce() allows > * coalescing neighboring slab objects into a single frag > * which triggers one of hardened usercopy checks. > */ > if (sendpage_ok(bv.bv_page)) > msg.msg_flags |= MSG_SPLICE_PAGES; > else > msg.msg_flags &= ~MSG_SPLICE_PAGES; > > iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, &bv, 1, bv.bv_len); > ret = sock_sendmsg(sock, &msg); > if (ret <= 0) { > if (ret == -EAGAIN) > ret = 0; > return ret; > } > > iov_iter_advance(it, ret); > } > > return 1; > } > > could be similarly affected if ->sendmsg() mucks about with msg_flags. With some help from the compiler - locally changing the proto_ops and proto sendmsg definition and handling the fallout - I found the following: - mptcp_sendmsg() is clearing unsupported msg_flags (I should have recalled this one even without much testing ;) - udpv4_sendmsg() is setting msg_name/msg_namelen - tls_device_sendmsg() is clearing MSG_SPLICE_PAGE when zerocopy is not supported - unix_seqpacket_sendmsg() is clearing msg_namelen I could have missed something, but the above looks safe for the use- case you mentioned. Cheers, Paolo