Re: rdma_create_qp_ex fails with EINVAL


 



On 1/7/2023 6:13 AM, Haeuptle, Michael wrote:


Hello,

I'm running into an issue where rdma_create_qp_ex returns EINVAL and I was hoping that someone could help me understand what is going on here.

The call that actually fails with EINVAL is the write() in rdma_init_qp_attr (which is called by rdma_create_qp_ex):
...
     ret = write(id->channel->fd, &cmd, sizeof cmd);
...

It returns -1 and sets errno to 22 (EINVAL).

Note, this is an intermittent error and not always reproducible.

The setup and scenario are as follows:
- SPDK NVMF target on Debian 11.3 with top-of-tree rdma-core libs
- NVMe-oF kernel initiator, Debian 11.5 (no change in rdma-core libs)
- There is a switch between the initiator and the SPDK NVMF targets
- The kernel initiator is talking to 2 SPDK NVMF targets via DM and round-robin (I don't think this matters)
- On the initiator system there is a 512k block size fio load against 48 NVMF subsystems (2 target apps with 24 subsystems each)
- When I kill the SPDK target and restart it, I occasionally get this EINVAL on one of the queue pairs

It's unclear to me why the write call is returning EINVAL. The file descriptor should be valid since I see the same fd in later qpair creation requests.

Any insights are appreciated.

-- Michael

Maybe the CM is in a state in which init_qp_attr cannot be performed? Do we know the QP state and the CM state? (A sniffer capture would be needed to check the last CM packet received/sent.) The file descriptor should be irrelevant.
If you are able to debug the kernel, try debugging this function:
  drivers/infiniband/core/cma.c::rdma_init_qp_attr()
to see where this EINVAL is returned and why.
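
On why the fd is unlikely to be the culprit: a closed or otherwise invalid descriptor would normally make write() fail with EBADF rather than EINVAL, so errno 22 most likely originates in the kernel-side handling of the command. A minimal C sketch of that distinction, using only libc. Note that write_errno() and errno_name() are hypothetical helper names made up for illustration; they are not part of rdma-core or SPDK.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper (not part of rdma-core): performs a write() and
 * returns 0 on success, or the errno value if write() returned -1. */
static int write_errno(int fd, const void *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    ssize_t ret = write(fd, buf, len);
    if (ret < 0)
        return errno;   /* e.g. EBADF for a bad fd, EINVAL from the handler */
    return 0;
}

/* Hypothetical helper: names the two error codes discussed in this thread
 * and falls back to strerror() for everything else. */
static const char *errno_name(int err)
{
    switch (err) {
    case 0:      return "OK";
    case EBADF:  return "EBADF";
    case EINVAL: return "EINVAL";
    default:     return strerror(err);
    }
}
```

Calling errno_name(write_errno(fd, &cmd, sizeof cmd)) at the failing call site would show which of the two cases applies; the location inside the kernel that rejects the command still has to be found by debugging the kernel function named above.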




