Re: Bug#1086520: linux-image-6.11.2-amd64: makes opensm fail to start

On Wed, Dec 04, 2024 at 06:13:56PM +0100, Francesco Poli wrote:
> On Wed, 4 Dec 2024 17:37:05 +0100 Uwe Kleine-König wrote:
> 
> > Hello Francesco,
> 
> Hello Uwe,
> 
> [...]
> > I wonder if you could test a firmware upgrade or the above patch. Would
> > be nice to know if there are still some things to do for us (= Debian
> > kernel team) here.
> 
> Yes, I've finally got around to upgrading the firmware.
> 
> And today I had a time window, where I could reboot the cluster head
> node.
> After the reboot, the InfiniBand network works correctly:
> 
>   $ uname -v
>   #1 SMP PREEMPT_DYNAMIC Debian 6.11.10-1 (2024-11-23)
>   $ ls -altrF /sys/class/infiniband_mad/
>   total 0
>   lrwxrwxrwx  1 root root    0 Dec  4 10:15 umad0 -> ../../devices/pci0000:80/0000:80:01.1/0000:81:00.0/infiniband_mad/umad0/
>   lrwxrwxrwx  1 root root    0 Dec  4 10:15 umad1 -> ../../devices/pci0000:80/0000:80:01.1/0000:81:00.1/infiniband_mad/umad1/
>   drwxr-xr-x  2 root root    0 Dec  4 10:17 ./
>   drwxr-xr-x 73 root root    0 Dec  4 10:17 ../
>   -r--r--r--  1 root root 4096 Dec  4 10:17 abi_version
>   lrwxrwxrwx  1 root root    0 Dec  4 18:08 issm1 -> ../../devices/pci0000:80/0000:80:01.1/0000:81:00.1/infiniband_mad/issm1/
>   lrwxrwxrwx  1 root root    0 Dec  4 18:08 issm0 -> ../../devices/pci0000:80/0000:80:01.1/0000:81:00.0/infiniband_mad/issm0/
>   # ethtool -i ibp129s0f0
>   driver: mlx5_core[ib_ipoib]
>   version: 6.11.10-amd64
>   firmware-version: 20.43.1014 (MT_0000000224)
>   expansion-rom-version:
>   bus-info: 0000:81:00.0
>   supports-statistics: yes
>   supports-test: yes
>   supports-eeprom-access: no
>   supports-register-dump: no
>   supports-priv-flags: yes
>   # ethtool -i ibp129s0f1
>   driver: mlx5_core[ib_ipoib]
>   version: 6.11.10-amd64
>   firmware-version: 20.43.1014 (MT_0000000224)
>   expansion-rom-version:
>   bus-info: 0000:81:00.1
>   supports-statistics: yes
>   supports-test: yes
>   supports-eeprom-access: no
>   supports-register-dump: no
>   supports-priv-flags: yes
>   $ ps aux | grep opens[m]
>   root        1150  0.0  0.0 1560776 3636 ?        Ssl  10:15   0:00 /usr/sbin/opensm --guid 0x9c63c00300033240 --log_file /var/log/opensm.0x9c63c00300033240.log
> 
> 
> > 
> > If everything is fine for you, I'd like to close this bug.
> 
> I am closing the Debian bug report right now.
> Thanks to everyone who has been involved for the great and kind help!

Thanks a lot, your help was really valuable.

BTW, we have an official fix [1], but it hasn't been sent yet because we
want to finish all the various tests first (E2E, QA, etc.).

[1] https://git.kernel.org/pub/scm/linux/kernel/git/leon/linux-rdma.git/commit/?h=rdma-next&id=09754c1e5d0d204747928290cc8c6f4371fd4c6a
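
For anyone who wants to repeat the checks quoted above after a firmware
upgrade or kernel change, here is a minimal sketch. The helper names
count_umad and opensm_running are hypothetical and not part of any
package; the default path is the one from the quoted ls output, and
pgrep is assumed to be available (procps).

```shell
#!/bin/sh
# Hypothetical helper: count umad device entries in a sysfs-style directory.
# Defaults to the path shown in the quoted `ls` output.
count_umad() {
    dir="${1:-/sys/class/infiniband_mad}"
    n=0
    for entry in "$dir"/umad*; do
        # An unmatched glob stays literal, so check the entry really exists
        # (-L also accepts symlinks whose target is missing).
        if [ -e "$entry" ] || [ -L "$entry" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Hypothetical helper: succeed if a process named exactly "opensm" is running.
opensm_running() {
    pgrep -x opensm >/dev/null 2>&1
}

# Example use (not executed here):
#   [ "$(count_umad)" -gt 0 ] && opensm_running && echo "umad devices present, opensm running"
```

This only confirms that the umad entries exist and that opensm is
running; it does not look at the firmware version or at the opensm log.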

> 
> > 
> > Best regards
> 
> Have a nice evening.   :-)
> 
> -- 
>  http://www.inventati.org/frx/
>  There's not a second to spare! To the laboratory!
> ..................................................... Francesco Poli .
>  GnuPG key fpr == CA01 1147 9CD2 EFDF FB82  3925 3E1C 27E1 1F69 BFFE





