On 18/03/2021 00:15, Stefan Metzmacher wrote:
> Hi Pavel,
>
>>>>>> here're patches which fix linking of send[msg]()/recv[msg]() calls
>>>>>> and make sure io_uring_enter() never generates a SIGPIPE.
>>>>
>>>> 1/2 breaks userspace.
>>>
>>> Can you explain that a bit please, how could some application ever
>>> have a useful use of IOSQE_IO_LINK with these socket calls?
>>
>> Packet delivery of variable size, i.e. recv(max_size). Byte stream
>> that consumes whatever you've got and links something (e.g. notification
>> delivery, or poll). Not sure about netlink, but maybe. Or some
>> "create a file via send" crap, or some made-up custom protocols
>
> Ok, then we need a flag or a new opcode to provide that behavior?
>
> For recv() and recvmsg() MSG_WAITALL might be usable.

Hmm, unrelated, but there is a good chance MSG_WAITALL with io_uring
is broken because of our first MSG_DONTWAIT attempt.

> It's not defined in 'man 2 sendmsg', but should we use it anyway
> for IORING_OP_SEND[MSG] in order to activate the short send check
> as the low level sock_sendmsg() call seems to ignore unused flags,
> which seems to be the reason for the following logic in tcp_sendmsg_locked:
>
> 	if (flags & MSG_ZEROCOPY && size && sock_flag(sk, SOCK_ZEROCOPY)) {

Yep, it maintains compatibility because of unchecked unsupported flags.
Alleviating an old design problem, IIRC.

>
> You need to set SOCK_ZEROCOPY in the socket in order to give a meaning
> to MSG_ZEROCOPY.
>
> Should I prepare an add-on patch to make the short send/recv logic depend
> on MSG_WAITALL?

IMHO, conceptually it would make much more sense with MSG_WAITALL.

>
> I'm cc'ing netdev@xxxxxxxxxxxxxxx in order to get more feedback on whether
> MSG_WAITALL can be passed to sendmsg without fear of triggering
> -EINVAL.
>
> The example for io_sendmsg() would look like this:
>
> --- a/fs/io_uring.c
> +++ b/fs/io_uring.c
> @@ -4383,7 +4383,7 @@ static int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
>  	struct io_async_msghdr iomsg, *kmsg;
>  	struct socket *sock;
>  	unsigned flags;
> -	int expected_ret;
> +	int min_ret = 0;
>  	int ret;
>
>  	sock = sock_from_file(req->file);
> @@ -4404,9 +4404,11 @@ static int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
>  	else if (issue_flags & IO_URING_F_NONBLOCK)
>  		flags |= MSG_DONTWAIT;
>
> -	expected_ret = iov_iter_count(&kmsg->msg.msg_iter);
> -	if (unlikely(expected_ret == MAX_RW_COUNT))
> -		expected_ret += 1;
> +	if (flags & MSG_WAITALL) {
> +		min_ret = iov_iter_count(&kmsg->msg.msg_iter);
> +		if (unlikely(min_ret == MAX_RW_COUNT))
> +			min_ret += 1;
> +	}
>  	ret = __sys_sendmsg_sock(sock, &kmsg->msg, flags);
>  	if ((issue_flags & IO_URING_F_NONBLOCK) && ret == -EAGAIN)
>  		return io_setup_async_msg(req, kmsg);
> @@ -4417,7 +4419,7 @@ static int io_sendmsg(struct io_kiocb *req, unsigned int issue_flags)
>  	if (kmsg->free_iov)
>  		kfree(kmsg->free_iov);
>  	req->flags &= ~REQ_F_NEED_CLEANUP;
> -	if (ret != expected_ret)
> +	if (ret < min_ret)
>  		req_set_fail_links(req);
>  	__io_req_complete(req, issue_flags, ret, 0);
>  	return 0;
>
> Which means the default of min_ret = 0 would result in:
>
> 	if (ret < 0)
> 		req_set_fail_links(req);
>
> again...
>
>>>> Sounds like 2/2 might too, does it?
>>>
>>> Do you think any application really expects to get a SIGPIPE
>>> when calling io_uring_enter()?
>>
>> If it was about what I think I would remove lots of old garbage :)
>> I doubt it wasn't working well before, e.g. because of iowq, but
>> who knows
>
> Yes, it was inconsistent before and now it's reliable.

-- 
Pavel Begunkov
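
For reference, a minimal userspace sketch of the link-failure rule proposed
in the quoted diff, in case it helps the netdev discussion. It only models
the decision "ret < min_ret" and is not kernel code: link_should_fail() is a
hypothetical helper name, "requested" stands in for
iov_iter_count(&kmsg->msg.msg_iter), and the MAX_RW_COUNT corner case from
the diff is left out.

```c
#include <stdbool.h>
#include <stddef.h>
#include <sys/socket.h>	/* MSG_WAITALL */

/*
 * Hypothetical model of the proposed check, not a kernel function.
 * ret:       result of the send/recv call (bytes transferred or -errno)
 * requested: number of bytes the caller asked for
 * flags:     msg flags supplied with the request
 */
static bool link_should_fail(int ret, size_t requested, int flags)
{
	int min_ret = 0;	/* default: only errors (ret < 0) break the link */

	/* With MSG_WAITALL a short transfer also breaks the link. */
	if (flags & MSG_WAITALL)
		min_ret = (int)requested;

	return ret < min_ret;
}
```

Under these assumptions a short transfer without MSG_WAITALL keeps the link
alive, while the same short transfer with MSG_WAITALL fails it, and a
negative return fails it either way.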