On Wed, Aug 11, 2021 at 12:33 AM yangx.jy@xxxxxxxxxxx <yangx.jy@xxxxxxxxxxx> wrote: > > On 2021/8/10 11:40, Zhu Yanjun wrote: > > On Mon, Aug 9, 2021 at 10:43 PM Xiao Yang <yangx.jy@xxxxxxxxxxx> wrote: > >> Resid indicates the residual length of transmitted bytes but current > >> rxe sets it to zero for inline data at the beginning. In this case, > >> request will transmit zero byte to responder wrongly. > > What are the symptoms if resid is set to zero? > Hi Yanjun, > > You can construct code by the attached patch and then run > rdma_client/server to reproduce the issue. > > If resid is set to zero, running rdma_client/server: > -------------------------------- > # ./rdma_client -s 10.167.220.99 -p 8765 > rdma_client: start > rdma_client: end 0 > > # ./rdma_server -s 10.167.220.99 -p 8765 > rdma_server: start > wc.byte_len 0 recv_msg bbbbbbbbbbbbbbbb > rdma_server: end 0 > -------------------------------- > > rdma_client sends zero byte instead of 16 btyes to rdma_server. > rdma_server receives zero byte instead of 16 btyes from rdma_client. > > Please also see the logic about resid in kernel, for example: > -------------------------------- > int rxe_requester(void *arg) > { > ... > payload = (mask & RXE_WRITE_OR_SEND) ? wqe->dma.resid : 0; > ... > skb = init_req_packet(qp, wqe, opcode, payload, &pkt); > ... > } > -------------------------------- > > Best Regards, > Xiao Yang Follow your steps on the latest rdma-core and linux upstream, make tests with the followings: " diff --git a/librdmacm/examples/rdma_client.c b/librdmacm/examples/rdma_client.c index c27047c5..6734757b 100644 --- a/librdmacm/examples/rdma_client.c +++ b/librdmacm/examples/rdma_client.c @@ -52,6 +52,7 @@ static int run(void) struct ibv_wc wc; int ret; + memset(send_msg, 'a', 16); memset(&hints, 0, sizeof hints); hints.ai_port_space = RDMA_PS_TCP; ret = rdma_getaddrinfo(server, port, &hints, &res); diff --git a/librdmacm/examples/rdma_server.c b/librdmacm/examples/rdma_server.c index f9c766b2..afa20996 100644 --- a/librdmacm/examples/rdma_server.c +++ b/librdmacm/examples/rdma_server.c @@ -132,6 +132,7 @@ static int run(void) goto out_disconnect; } + printf("wc.byte_len %u recv_msg %s\n", wc.byte_len, recv_msg); ret = rdma_post_send(id, NULL, send_msg, 16, send_mr, send_flags); if (ret) { perror("rdma_post_send"); " The followings are results: # ./build/bin/rdma_server -s 10.238.154.61 -p 5486& [1] 10812 # rdma_server: start # ./build/bin/rdma_client -s 10.238.154.61 -p 5486& [2] 10815 # rdma_client: start wc.byte_len 16 recv_msg aaaaaaaaaaaaaaaa <--------------it seems that inline is 16? rdma_server: end 0 rdma_client: end 0 [1]- Done ./build/bin/rdma_server -s 10.238.154.61 -p 5486 [2]+ Done ./build/bin/rdma_client -s 10.238.154.61 -p 5486 What does your commit fix? Zhu Yanjun > > Thanks > > Zhu Yanjun > > > >> Resid should be set to the total length of transmitted bytes at the > >> beginning. > >> > >> Note: > >> Just remove the useless setting of resid in init_send_wqe(). > >> > >> Fixes: 1a894ca10105 ("Providers/rxe: Implement ibv_create_qp_ex verb") > >> Fixes: 8337db5df125 ("Providers/rxe: Implement memory windows") > >> Signed-off-by: Xiao Yang <yangx.jy@xxxxxxxxxxx> > >> --- > >> providers/rxe/rxe.c | 5 ++--- > >> 1 file changed, 2 insertions(+), 3 deletions(-) > >> > >> diff --git a/providers/rxe/rxe.c b/providers/rxe/rxe.c > >> index 3c3ea8bb..3533a325 100644 > >> --- a/providers/rxe/rxe.c > >> +++ b/providers/rxe/rxe.c > >> @@ -1004,7 +1004,7 @@ static void wr_set_inline_data(struct ibv_qp_ex *ibqp, void *addr, > >> > >> memcpy(wqe->dma.inline_data, addr, length); > >> wqe->dma.length = length; > >> - wqe->dma.resid = 0; > >> + wqe->dma.resid = length; > >> } > >> > >> static void wr_set_inline_data_list(struct ibv_qp_ex *ibqp, size_t num_buf, > >> @@ -1035,6 +1035,7 @@ static void wr_set_inline_data_list(struct ibv_qp_ex *ibqp, size_t num_buf, > >> } > >> > >> wqe->dma.length = tot_length; > >> + wqe->dma.resid = tot_length; > >> } > >> > >> static void wr_set_sge(struct ibv_qp_ex *ibqp, uint32_t lkey, uint64_t addr, > >> @@ -1473,8 +1474,6 @@ static int init_send_wqe(struct rxe_qp *qp, struct rxe_wq *sq, > >> if (ibwr->send_flags & IBV_SEND_INLINE) { > >> uint8_t *inline_data = wqe->dma.inline_data; > >> > >> - wqe->dma.resid = 0; > >> - > >> for (i = 0; i < num_sge; i++) { > >> memcpy(inline_data, > >> (uint8_t *)(long)ibwr->sg_list[i].addr, > >> -- > >> 2.25.1 > >> > >> > >> >