On 11/26/2024 3:38 AM, Leon Romanovsky wrote:
On Mon, Nov 25, 2024 at 07:54:43PM +0100, Francesco Poli wrote:
On Thu, 21 Nov 2024 11:04:13 +0100 Uwe Kleine-König wrote:
[...]
It looks like the commit that is biting you is
https://git.kernel.org/linus/50660c5197f52b8137e223dc3ba8d43661179a1d
So if you bisect, try 50660c5197f52b8137e223dc3ba8d43661179a1d and its
parent 24943dcdc156cf294d97a36bf5c51168bf574c22 first.
I started to bisect.
The first surprise is that 50660c5197f52b8137e223dc3ba8d43661179a1d is
good... :-o
That is good news, as I have been looking at that commit ever since
Uwe reported it.
<...>
I will try to continue to bisect by testing the resulting kernels on a
compute node: there's no OpenSM there and it cannot run anyway, if
there's another OpenSM on the same InfiniBand network.
However, I can check whether those issm* symlinks are created in
/sys/class/infiniband_mad/
I really hope that this is enough to pinpoint the first bad
commit...
Yes, these symlinks should be there. Your test scenario is the correct one.
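For reference, the symlink check described above could be scripted roughly
like this for each bisect step. It is only a sketch: the helper name
list_issm_entries is made up for illustration, and it assumes the entries
appear directly under /sys/class/infiniband_mad/ with names starting with
"issm".

```python
import glob
import os


def list_issm_entries(mad_dir="/sys/class/infiniband_mad"):
    """Return the sorted names of issm* entries found directly in mad_dir.

    Hypothetical helper for illustration; a missing directory simply
    yields an empty list.
    """
    pattern = os.path.join(mad_dir, "issm*")
    return sorted(os.path.basename(path) for path in glob.glob(pattern))


if __name__ == "__main__":
    entries = list_issm_entries()
    if entries:
        print("issm entries present:", ", ".join(entries))
    else:
        print("no issm* entries found")
```

An empty result on a given kernel would mark that bisect step as bad, a
non-empty one as good.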
Any better ideas?
I think that commit 2a5db20fa532 ("RDMA/mlx5: Add support to multi-plane device and port")
is the one causing the trouble, which leads me to suspect the FW.
Yes, it looks like the FW reports vport.num_plane > 0. What is your HW type
and FW version ("ethtool -i <netdev_of_the_ibdev>")? I don't think it
supports multiplane.
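
In case it helps with collecting that, here is a rough sketch that runs the
ethtool command above and picks out the firmware line. The helper names
(parse_ethtool_info, driver_info) are hypothetical, and it assumes ethtool
prints "key: value" lines including a "firmware-version" field.

```python
import subprocess
import sys


def parse_ethtool_info(text):
    """Parse "key: value" lines, as printed by `ethtool -i`, into a dict."""
    info = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as PCI bus
        # addresses that contain colons themselves stay intact.
        key, sep, value = line.partition(":")
        if sep:
            info[key.strip()] = value.strip()
    return info


def driver_info(netdev):
    """Run `ethtool -i <netdev>` and return the parsed fields."""
    result = subprocess.run(
        ["ethtool", "-i", netdev],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_ethtool_info(result.stdout)


if __name__ == "__main__":
    # Usage: python3 fwinfo.py <netdev_of_the_ibdev>
    fields = driver_info(sys.argv[1])
    print("driver:", fields.get("driver", "unknown"))
    print("firmware-version:", fields.get("firmware-version", "unknown"))
```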