On Tue, Oct 6, 2015 at 7:05 PM, Doug Ledford <dledford@xxxxxxxxxx> wrote:
> I'll have some sort of answer for that soon. I spent the better part of
> last week, and what time I worked on the weekend, plus all day yesterday
> on the internal infrastructure here at Red Hat. We're experiencing some
> growing pains in our cluster and some downtime as a result that keeps me
> from being able to test code effectively. I wouldn't be surprised if it
> takes another day or two to get it completely sorted out (or sorted as
> best I can, some things are out of my control). Then I have to see if
> any of the currently posted fixes for 4.3rc that I haven't grabbed yet
> resolve the iSER issue I'm seeing, then I'll move on to for-next processing.

Doug,

From my experience with VPI (IB/RoCE) clusters, librdmacm/rping is the
answer: if you have rping up and running over kernel X for both IB and
RoCE, things aren't in such a bad state. If you want to go deeper, have
it working over an IB non-default partition and an Ethernet VLAN.

Also, for IB multicast, mckey with the IPoIB port space and iperf
multicast over IPoIB would tell you how things are.

So all in all, sans SRIOV, it should take you a whole 20 minutes to
figure out whether something is really DOA over IB/RoCE HW, and I believe
iWARP too (rping). Makes sense?

What we do know needs fixing for 4.3-rc:

--> RoCE: you need the patch re-posted by Haggai a few hours ago,
"IB/cma: Accept connection without a valid netdev on RoCE". Without it,
RoCE isn't working.

--> mlx5 devices and non-default IB pkeys: Haggai and Co are working on a
fix, since this hasn't been working since 4.3-rc1. I told them we need it
by rc5.5 (i.e. a few days before rc6), and if not, we will have to revert
some 4.3-rc1 bits.

Or.
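
For what it's worth, the rping sanity check described above could be scripted roughly as follows. This is only a sketch under assumptions: the labels and addresses in TARGETS are placeholders (documentation-range IPs, not anything from this thread), an rping server is assumed to be already listening on each address, and the helper names are made up for illustration. The only rping options used are -s/-c (server/client), -a (address), -v (verbose) and -C (iteration count).

```python
import shutil
import subprocess

# Hypothetical test matrix: one address per configuration mentioned in the
# mail. Labels and addresses are placeholders, not taken from the thread.
TARGETS = {
    "ib-default-pkey": "192.0.2.10",
    "ib-non-default-pkey": "192.0.2.20",
    "roce": "198.51.100.10",
    "roce-vlan": "198.51.100.20",
}


def rping_server_cmd(addr, count=10):
    """Command line for the listening side, to be started on the peer host."""
    return ["rping", "-s", "-a", addr, "-v", "-C", str(count)]


def rping_client_cmd(addr, count=10):
    """Command line for the connecting side."""
    return ["rping", "-c", "-a", addr, "-v", "-C", str(count)]


def run_client(addr, count=10, timeout=30):
    """Run one rping client.

    Returns True if rping exits with status 0, False if it fails or times
    out, and None if rping is not installed on this host.
    """
    if shutil.which("rping") is None:
        return None
    try:
        result = subprocess.run(
            rping_client_cmd(addr, count),
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0


def smoke_test(targets=None):
    """Run the client against every target and collect per-label results."""
    targets = TARGETS if targets is None else targets
    return {label: run_client(addr) for label, addr in targets.items()}


if __name__ == "__main__":
    for label, ok in smoke_test().items():
        status = {True: "ok", False: "FAILED", None: "rping not found"}[ok]
        print(f"{label}: {status}")
```

Start the matching server command on the peer for each address first, then run the script on the client side; a FAILED entry only says where to start looking, not what is broken.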