Re: 【BugReport】ibv_srq_pingpong test bug


 



On 9/11/2019 10:26 AM, oulijun wrote:
Hi Roland Dreier and others,

          I am testing ibv_srq_pingpong on hip08. The test result is as follows:

           local address:  LID 0x0000, QPN 0x0000ff, PSN 0xdca3b1, GID ::

           local address:  LID 0x0000, QPN 0x000100, PSN 0xf62247, GID ::

           local address:  LID 0x0000, QPN 0x000101, PSN 0x7de385, GID ::

           local address:  LID 0x0000, QPN 0x000102, PSN 0xc5fcf0, GID ::

           local address:  LID 0x0000, QPN 0x000103, PSN 0x3e0843, GID ::

           local address:  LID 0x0000, QPN 0x000104, PSN 0x320be9, GID ::

           local address:  LID 0x0000, QPN 0x000105, PSN 0xb82994, GID ::

           local address:  LID 0x0000, QPN 0x000106, PSN 0xf9e7fd, GID ::

           local address:  LID 0x0000, QPN 0x000107, PSN 0xdfee5d, GID ::

           local address:  LID 0x0000, QPN 0x000108, PSN 0x02891b, GID ::

           local address:  LID 0x0000, QPN 0x000109, PSN 0x37d823, GID ::

           local address:  LID 0x0000, QPN 0x00010a, PSN 0x75397a, GID ::

           local address:  LID 0x0000, QPN 0x00010b, PSN 0x0e02de, GID ::

           local address:  LID 0x0000, QPN 0x00010c, PSN 0x7e9633, GID ::

           local address:  LID 0x0000, QPN 0x00010d, PSN 0x5b4a75, GID ::

           local address:  LID 0x0000, QPN 0x00010e, PSN 0xe9a195, GID ::

Failed to modify QP[0] to RTR


Judging by the trace below, it looks like you are using RoCE, correct? If so, you need to supply a GID index on the command line (e.g. -g 0).
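
As a rough illustration of why the GID index matters: the `GID ::` in the output above means no GID was exchanged, and a pingpong-style client that only requests a GRH when the peer's GID is non-zero would then leave the GRH unset. Below is a minimal sketch of that decision. It uses simplified stand-in types (`pp_gid_sketch`, `pp_ah_sketch` and `pp_fill_ah` are hypothetical names, not the libibverbs definitions or the actual example source), and the "non-zero GID" rule is an assumption about how such a client behaves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins for illustration; not the real libibverbs types. */
struct pp_gid_sketch {
    uint8_t raw[16];
};

struct pp_ah_sketch {
    int is_global;              /* 1 means "attach a GRH to this address" */
    int sgid_index;             /* local GID table index (what -g selects) */
    struct pp_gid_sketch dgid;  /* remote GID */
};

/* Returns 1 if any byte of the GID is non-zero, 0 for the all-zero GID "::". */
static int pp_gid_is_set(const struct pp_gid_sketch *gid)
{
    static const struct pp_gid_sketch zero;

    assert(gid != NULL);
    return memcmp(gid, &zero, sizeof(zero)) != 0;
}

/* Assumed rule: request a GRH only when the peer advertised a real GID. */
static void pp_fill_ah(struct pp_ah_sketch *ah,
                       const struct pp_gid_sketch *remote_gid,
                       int gid_index)
{
    assert(ah != NULL);
    memset(ah, 0, sizeof(*ah));
    if (pp_gid_is_set(remote_gid)) {
        ah->is_global = 1;
        ah->dgid = *remote_gid;
        ah->sgid_index = gid_index;
    }
}
```

Under this assumption, running without a GID index leaves `is_global` at 0, so the request reaches the kernel without a GRH.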

Couldn't connect to remote QP

           I debugged it as follows:

           When ibv_modify_qp is called, it reaches the following function, to which I added trace prints:

408 static int rdma_check_ah_attr(struct ib_device *device,
409                               struct rdma_ah_attr *ah_attr)
410 {
411         if (!rdma_is_port_valid(device, ah_attr->port_num))
412                 return -EINVAL;
413         printk("[%s, %d] point!\n", __func__, __LINE__);
414         printk("[%s, %d] rdma_is_grh_required(device, ah_attr->port_num) = %d\n",
415                 __func__, __LINE__, rdma_is_grh_required(device, ah_attr->port_num));
416         printk("[%s, %d] ah_attr->type = %d!\n", __func__, __LINE__, ah_attr->type);
417         printk("[%s, %d] ah_attr->ah_flags = %d!\n", __func__, __LINE__, ah_attr->ah_flags);
418         if ((rdma_is_grh_required(device, ah_attr->port_num) ||
419              ah_attr->type == RDMA_AH_ATTR_TYPE_ROCE) &&
420             !(ah_attr->ah_flags & IB_AH_GRH))
421                 return -EINVAL;
422         printk("[%s, %d] point!\n", __func__, __LINE__);
423         if (ah_attr->grh.sgid_attr) {
424                 /*
425                  * Make sure the passed sgid_attr is consistent with the
426                  * parameters
427                  */
428                 if (ah_attr->grh.sgid_attr->index != ah_attr->grh.sgid_index ||
429                     ah_attr->grh.sgid_attr->port_num != ah_attr->port_num)
430                         return -EINVAL;
431         }
432         printk("[%s, %d] point!\n", __func__, __LINE__);
433         return 0;
434 }

When the trace reaches line 420, the function returns failure (-EINVAL at line 421). I don't understand this check, because it should pass when running in RoCE mode.

ah_attr->type is RDMA_AH_ATTR_TYPE_ROCE, so ah_attr->ah_flags should contain IB_AH_GRH.

However, the value of ah_attr->ah_flags is 2. I think the protocol layer should guarantee the value of ah_attr->ah_flags.

So I suspect that the protocol layer or ibv_srq_pingpong has an implementation defect.
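
The condition at lines 418-421 can be modelled in isolation to see which inputs it rejects. This is only a sketch with hypothetical stand-in types: `ah_attr_sketch`, `SKETCH_AH_TYPE_ROCE` and `sketch_check_ah_attr` are invented names, and the value 0x1 for the GRH flag bit is an assumption rather than something copied from the kernel headers.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel definitions. */
enum sketch_ah_type {
    SKETCH_AH_TYPE_IB,
    SKETCH_AH_TYPE_ROCE
};

#define SKETCH_AH_GRH 0x1u  /* assumed bit value for the "has GRH" flag */

struct ah_attr_sketch {
    enum sketch_ah_type type;
    unsigned int ah_flags;
};

/*
 * Mirrors the shape of the check at lines 418-421: if the port requires
 * a GRH, or the address is a RoCE address, the GRH flag must be set.
 */
static int sketch_check_ah_attr(const struct ah_attr_sketch *ah,
                                bool grh_required)
{
    assert(ah != NULL);
    if ((grh_required || ah->type == SKETCH_AH_TYPE_ROCE) &&
        !(ah->ah_flags & SKETCH_AH_GRH))
        return -EINVAL;
    return 0;
}
```

With these assumptions, a RoCE address whose flags are 0x2 has the GRH bit clear and is rejected with -EINVAL, while the same address with the GRH bit set passes. The test is a bit test, so any flags value without that bit fails regardless of which other bits are set.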

I also ran ibv_srq_pingpong on cx5, and the result is the same:

root@ubuntu-51-7:~# ibv_srq_pingpong -d mlx5_0 -p 10002

   local address:  LID 0x0000, QPN 0x0000ff, PSN 0xdca3b1, GID ::

   local address:  LID 0x0000, QPN 0x000100, PSN 0xf62247, GID ::

   local address:  LID 0x0000, QPN 0x000101, PSN 0x7de385, GID ::

   local address:  LID 0x0000, QPN 0x000102, PSN 0xc5fcf0, GID ::

   local address:  LID 0x0000, QPN 0x000103, PSN 0x3e0843, GID ::

   local address:  LID 0x0000, QPN 0x000104, PSN 0x320be9, GID ::

   local address:  LID 0x0000, QPN 0x000105, PSN 0xb82994, GID ::

   local address:  LID 0x0000, QPN 0x000106, PSN 0xf9e7fd, GID ::

   local address:  LID 0x0000, QPN 0x000107, PSN 0xdfee5d, GID ::

   local address:  LID 0x0000, QPN 0x000108, PSN 0x02891b, GID ::

   local address:  LID 0x0000, QPN 0x000109, PSN 0x37d823, GID ::

   local address:  LID 0x0000, QPN 0x00010a, PSN 0x75397a, GID ::

   local address:  LID 0x0000, QPN 0x00010b, PSN 0x0e02de, GID ::

   local address:  LID 0x0000, QPN 0x00010c, PSN 0x7e9633, GID ::

   local address:  LID 0x0000, QPN 0x00010d, PSN 0x5b4a75, GID ::

   local address:  LID 0x0000, QPN 0x00010e, PSN 0xe9a195, GID ::

Failed to modify QP[0] to RTR

Couldn't connect to remote QP

Thanks

Lijun Ou




