On Mon, Jun 26, 2023 at 11:05 PM David Howells <dhowells@xxxxxxxxxx> wrote:
>
> Fix the mishandling of MSG_DONTWAIT and also reinstate the per-page
> checking of the source pages (which might have come from a DIO write by
> userspace) by partially reverting the changes to support MSG_SPLICE_PAGES
> and doing things a little differently. In messenger_v1:
>
> (1) ceph_tcp_sendpage() is resurrected and the callers reverted to use
>     that.
>
> (2) The callers now pass MSG_MORE unconditionally. Previously, they were
>     passing in MSG_MORE|MSG_SENDPAGE_NOTLAST and then degrading that to
>     just MSG_MORE on the last call to ->sendpage().
>
> (3) Make ceph_tcp_sendpage() a wrapper around sendmsg() rather than
>     sendpage(), setting MSG_SPLICE_PAGES if sendpage_ok() returns true on
>     the page.
>
> In messenger_v2:
>
> (4) Bring back do_try_sendpage() and make the callers use that.
>
> (5) Make do_try_sendpage() use sendmsg() for both cases and set
>     MSG_SPLICE_PAGES if sendpage_ok() is set.
>
> Fixes: 40a8c17aa770 ("ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage")
> Fixes: fa094ccae1e7 ("ceph: Use sendmsg(MSG_SPLICE_PAGES) rather than sendpage()")
> Reported-by: Ilya Dryomov <idryomov@xxxxxxxxx>
> Link: https://lore.kernel.org/r/CAOi1vP9vjLfk3W+AJFeexC93jqPaPUn2dD_4NrzxwoZTbYfOnw@xxxxxxxxxxxxxx/
> Link: https://lore.kernel.org/r/CAOi1vP_Bn918j24S94MuGyn+Gxk212btw7yWeDrRcW1U8pc_BA@xxxxxxxxxxxxxx/
> Signed-off-by: David Howells <dhowells@xxxxxxxxxx>
> cc: Ilya Dryomov <idryomov@xxxxxxxxx>
> cc: Xiubo Li <xiubli@xxxxxxxxxx>
> cc: Jeff Layton <jlayton@xxxxxxxxxx>
> cc: "David S. Miller" <davem@xxxxxxxxxxxxx>
> cc: Eric Dumazet <edumazet@xxxxxxxxxx>
> cc: Jakub Kicinski <kuba@xxxxxxxxxx>
> cc: Paolo Abeni <pabeni@xxxxxxxxxx>
> cc: Jens Axboe <axboe@xxxxxxxxx>
> cc: Matthew Wilcox <willy@xxxxxxxxxxxxx>
> cc: ceph-devel@xxxxxxxxxxxxxxx
> cc: netdev@xxxxxxxxxxxxxxx
> Link: https://lore.kernel.org/r/3101881.1687801973@xxxxxxxxxxxxxxxxxxxxxx/ # v1
> ---
> Notes:
>     ver #2)
>     - Removed mention of MSG_SENDPAGE_NOTLAST in comments.
>     - Changed some refs to sendpage to MSG_SPLICE_PAGES in comments.
>     - Init msg_iter in ceph_tcp_sendpage().
>     - Move setting of MSG_SPLICE_PAGES in do_try_sendpage() next to comment
>       and adjust how it is cleared.
>
>  net/ceph/messenger_v1.c | 58 ++++++++++++++++++++-----------
>  net/ceph/messenger_v2.c | 88 ++++++++++++++++++++++++++++++++++++++----------
>  2 files changed, 107 insertions(+), 39 deletions(-)
>
> diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c
> index 814579f27f04..51a6f28aa798 100644
> --- a/net/ceph/messenger_v1.c
> +++ b/net/ceph/messenger_v1.c
> @@ -74,6 +74,39 @@ static int ceph_tcp_sendmsg(struct socket *sock, struct kvec *iov,
>  	return r;
>  }
>
> +/*
> + * @more: MSG_MORE or 0.
> + */
> +static int ceph_tcp_sendpage(struct socket *sock, struct page *page,
> +			     int offset, size_t size, int more)
> +{
> +	struct msghdr msg = {
> +		.msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL | more,
> +	};
> +	struct bio_vec bvec;
> +	int ret;
> +
> +	/*
> +	 * MSG_SPLICE_PAGES cannot properly handle pages with page_count == 0;
> +	 * we need to fall back to sendmsg if that's the case.
> +	 *
> +	 * Same goes for slab pages: skb_can_coalesce() allows
> +	 * coalescing neighboring slab objects into a single frag which
> +	 * triggers one of the hardened usercopy checks.
> +	 */
> +	if (sendpage_ok(page))
> +		msg.msg_flags |= MSG_SPLICE_PAGES;
> +
> +	bvec_set_page(&bvec, page, size, offset);
> +	iov_iter_bvec(&msg.msg_iter, ITER_DEST, &bvec, 1, size);

Hi David,

Shouldn't this be ITER_SOURCE?

Thanks,

                Ilya