Re: Regression: Connect-X5 doesn't connect with NVME-of

Hi Logan,

On 2/3/2018 6:53 AM, Saeed Mahameed wrote:


On 02/01/2018 09:56 AM, Logan Gunthorpe wrote:
Hello,

We've hit a regression when using nvme-of with two Connect-X5s. With v4.15 and v4.14.16 we see the following dmesg output when trying to connect to the target:

I would like to reproduce it in our lab, so please describe your environment and the topology you run (back-to-back, switch, or loopback?).


[   43.732539] nvme nvme2: creating 16 I/O queues.
[   44.072427] nvmet: adding queue 1 to ctrl 1.
[   44.072553] nvmet: adding queue 2 to ctrl 1.
[   44.072597] nvme nvme2: Connect command failed, error wo/DNR bit: -16402
[   44.072609] nvme nvme2: failed to connect queue: 3 ret=-18
[   44.075421] nvmet_rdma: freeing queue 2
[   44.075792] nvmet_rdma: freeing queue 1
[   44.264293] nvmet_rdma: freeing queue 3
*snip*

(On v4.15 there are additional panics, likely due to some other nvme-of error handling bugs.)

I fixed the panic in the connect error flow by fixing the state machine in the NVMe core.
The fix should be pushed to 4.16-rc and, I hope, to 4.15.x soon.


And nvme connect returns:

Failed to write to /dev/nvme-fabrics: Invalid cross-device link
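
A note on how the error codes above relate, as a minimal sketch. It assumes the "error wo/DNR bit" line prints the return value with the NVMe Do Not Retry bit (0x4000, NVME_SC_DNR in the kernel headers) cleared, and that the value passed in was the negative errno -EXDEV rather than an NVMe status code. Under those assumptions, -16402 is simply -18 with that bit masked off, and errno 18 is EXDEV, which matches the "Invalid cross-device link" message from nvme connect. The helper name mask_dnr is made up for illustration.

```python
import errno
import os

# "Do Not Retry" bit of the NVMe status field (NVME_SC_DNR in the kernel).
NVME_SC_DNR = 0x4000


def mask_dnr(errval: int) -> int:
    """Clear the DNR bit, as the 'error wo/DNR bit' log line is assumed to do.

    Hypothetical helper for illustration only; it is not a kernel function.
    """
    return errval & ~NVME_SC_DNR


# ret=-18 from "failed to connect queue: 3 ret=-18" (EXDEV is 18 on Linux).
ret = -errno.EXDEV

# Masking a negative errno as if it were an NVMe status clears bit 14 of the
# two's-complement value, which turns -18 into the value seen in dmesg.
print(mask_dnr(ret))

# Human-readable text for the errno that nvme connect reported.
print(os.strerror(errno.EXDEV))
```

So both dmesg lines and the nvme connect message appear to describe the same failure: a queue connect that returned -EXDEV.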

The two adapters are identical and run the latest available firmware:

     transport:       InfiniBand (0)
     fw_ver:          16.21.2010
     vendor_id:       0x02c9
     vendor_part_id:  4119
     hw_ver:          0x0
     board_id:        MT_0000000010

We bisected and found that the commit that broke our setup is:

05e0cc84e00c net/mlx5: Fix get vector affinity helper function

I doubt that the issue is in this fix itself, but with this fix the automatic affinity settings
for nvme over rdma are enabled. Maybe a bug was hiding there and we just stepped on it.

Added Sagi, maybe he can help us spot the issue here.

Thanks,
saeed.
