On 2021/8/10 11:40, Zhu Yanjun wrote: > On Mon, Aug 9, 2021 at 10:43 PM Xiao Yang <yangx.jy@xxxxxxxxxxx> wrote: >> Resid indicates the residual length of transmitted bytes but current >> rxe sets it to zero for inline data at the beginning. In this case, >> request will transmit zero byte to responder wrongly. > What are the symptoms if resid is set to zero? Hi Yanjun, You can construct code by the attached patch and then run rdma_client/server to reproduce the issue. If resid is set to zero, running rdma_client/server: -------------------------------- # ./rdma_client -s 10.167.220.99 -p 8765 rdma_client: start rdma_client: end 0 # ./rdma_server -s 10.167.220.99 -p 8765 rdma_server: start wc.byte_len 0 recv_msg bbbbbbbbbbbbbbbb rdma_server: end 0 -------------------------------- rdma_client sends zero byte instead of 16 btyes to rdma_server. rdma_server receives zero byte instead of 16 btyes from rdma_client. Please also see the logic about resid in kernel, for example: -------------------------------- int rxe_requester(void *arg) { ... payload = (mask & RXE_WRITE_OR_SEND) ? wqe->dma.resid : 0; ... skb = init_req_packet(qp, wqe, opcode, payload, &pkt); ... } -------------------------------- Best Regards, Xiao Yang > Thanks > Zhu Yanjun > >> Resid should be set to the total length of transmitted bytes at the >> beginning. >> >> Note: >> Just remove the useless setting of resid in init_send_wqe(). >> >> Fixes: 1a894ca10105 ("Providers/rxe: Implement ibv_create_qp_ex verb") >> Fixes: 8337db5df125 ("Providers/rxe: Implement memory windows") >> Signed-off-by: Xiao Yang <yangx.jy@xxxxxxxxxxx> >> --- >> providers/rxe/rxe.c | 5 ++--- >> 1 file changed, 2 insertions(+), 3 deletions(-) >> >> diff --git a/providers/rxe/rxe.c b/providers/rxe/rxe.c >> index 3c3ea8bb..3533a325 100644 >> --- a/providers/rxe/rxe.c >> +++ b/providers/rxe/rxe.c >> @@ -1004,7 +1004,7 @@ static void wr_set_inline_data(struct ibv_qp_ex *ibqp, void *addr, >> >> memcpy(wqe->dma.inline_data, addr, length); >> wqe->dma.length = length; >> - wqe->dma.resid = 0; >> + wqe->dma.resid = length; >> } >> >> static void wr_set_inline_data_list(struct ibv_qp_ex *ibqp, size_t num_buf, >> @@ -1035,6 +1035,7 @@ static void wr_set_inline_data_list(struct ibv_qp_ex *ibqp, size_t num_buf, >> } >> >> wqe->dma.length = tot_length; >> + wqe->dma.resid = tot_length; >> } >> >> static void wr_set_sge(struct ibv_qp_ex *ibqp, uint32_t lkey, uint64_t addr, >> @@ -1473,8 +1474,6 @@ static int init_send_wqe(struct rxe_qp *qp, struct rxe_wq *sq, >> if (ibwr->send_flags & IBV_SEND_INLINE) { >> uint8_t *inline_data = wqe->dma.inline_data; >> >> - wqe->dma.resid = 0; >> - >> for (i = 0; i < num_sge; i++) { >> memcpy(inline_data, >> (uint8_t *)(long)ibwr->sg_list[i].addr, >> -- >> 2.25.1 >> >> >>
From 5f2ff0e6f5ee3c4431a27e6bfba0868ad59ef56f Mon Sep 17 00:00:00 2001 From: Xiao Yang <yangx.jy@xxxxxxxxxxx> Date: Wed, 11 Aug 2021 00:46:31 +0800 Subject: [PATCH] rxe: construct code and reproduce the issue by rdma_client/server Signed-off-by: Xiao Yang <yangx.jy@xxxxxxxxxxx> --- librdmacm/examples/rdma_client.c | 2 ++ librdmacm/examples/rdma_server.c | 4 ++++ providers/rxe/rxe.c | 2 +- 3 files changed, 7 insertions(+), 1 deletion(-) diff --git a/librdmacm/examples/rdma_client.c b/librdmacm/examples/rdma_client.c index c27047c5..6e568110 100644 --- a/librdmacm/examples/rdma_client.c +++ b/librdmacm/examples/rdma_client.c @@ -52,6 +52,8 @@ static int run(void) struct ibv_wc wc; int ret; + memset(send_msg, 'a', 16); + memset(&hints, 0, sizeof hints); hints.ai_port_space = RDMA_PS_TCP; ret = rdma_getaddrinfo(server, port, &hints, &res); diff --git a/librdmacm/examples/rdma_server.c b/librdmacm/examples/rdma_server.c index f9c766b2..ffb440c6 100644 --- a/librdmacm/examples/rdma_server.c +++ b/librdmacm/examples/rdma_server.c @@ -53,6 +53,8 @@ static int run(void) struct ibv_wc wc; int ret; + memset(recv_msg, 'b', 16); + memset(&hints, 0, sizeof hints); hints.ai_flags = RAI_PASSIVE; hints.ai_port_space = RDMA_PS_TCP; @@ -132,6 +134,8 @@ static int run(void) goto out_disconnect; } + printf("wc.byte_len %u recv_msg %s\n", wc.byte_len, recv_msg); + ret = rdma_post_send(id, NULL, send_msg, 16, send_mr, send_flags); if (ret) { perror("rdma_post_send"); diff --git a/providers/rxe/rxe.c b/providers/rxe/rxe.c index 3c3ea8bb..ed5479fe 100644 --- a/providers/rxe/rxe.c +++ b/providers/rxe/rxe.c @@ -1492,7 +1492,7 @@ static int init_send_wqe(struct rxe_qp *qp, struct rxe_wq *sq, wqe->iova = ibwr->wr.rdma.remote_addr; wqe->dma.length = length; - wqe->dma.resid = length; + wqe->dma.resid = 0; wqe->dma.num_sge = num_sge; wqe->dma.cur_sge = 0; wqe->dma.sge_offset = 0; -- 2.25.1