RE: cqe dump errors on target while running nvme-of large block read IO

> These errors are either from:
> 1. mapping error on the host side - not sure given we don't see any error
> completions/events from the rdma device. However, can you turn on dynamic
> debug to see QP events?
> 
> echo "func nvme_rdma_qp_event +p" >
> /sys/kernel/debug/dynamic_debug/control

Yes, I can try this out.  Will this just print to dmesg or do I need to collect a log from somewhere?
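
For reference, here's the sequence I'm planning to run, on the assumption that the dynamic debug output lands in the kernel ring buffer (so dmesg should catch it):

	# enable the QP event debug print (assumes debugfs is mounted at /sys/kernel/debug)
	echo "func nvme_rdma_qp_event +p" > /sys/kernel/debug/dynamic_debug/control

	# follow the kernel log while the workload runs; the grep pattern is a guess
	# at the message text -- adjust it to whatever the driver actually prints
	dmesg -w | grep -i qp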

> The fact that null_blk didn't reproduce this was probably because it is less
> bursty (which can cause network congestion).

See the email I just sent in reply to Max (in this same thread).  I believe we reproduced the same issue with null_blk last night, after configuring some latency into the null_blk devices (roughly as sketched below).
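
A rough sketch of how we added that latency, using null_blk's timer-based completion mode (the parameter values here are illustrative, not necessarily the ones from last night's run):

	# unload any existing instances first
	modprobe -r null_blk

	# timer-based completions (irqmode=2) with ~500us of added completion latency
	modprobe null_blk nr_devices=4 queue_mode=2 irqmode=2 completion_nsec=500000 bs=4096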

> Joseph, are you sure that flow control is correctly configured and working
> reliably?

I believe it is set up correctly.  Running ethtool against the NIC interfaces in use reports:
	Supported pause frame use: Symmetric Receive-only

And all ports in use on the Arista 7060X switch report it turned on in both directions:
	flowcontrol send on
	flowcontrol receive on

If there's anywhere else we should check, or any direct test of flow control we can run, we're happy to try it.  Should we be OK with only Rx flow control at the NIC (this seems to be the default behavior), or is it recommended to enable Tx flow control as well?  The ethtool commands below show what we'd run to inspect and change the pause settings.
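
A minimal sketch, assuming a standard ethtool (the interface name eth2 is just a placeholder for whichever port is in use):

	# query the current pause-frame settings (autonegotiated and forced)
	ethtool -a eth2

	# force flow control on in both directions, if we decide to test symmetric pause
	ethtool -A eth2 rx on tx on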