Hi Joao,

Since mlx5 supported devices can do DMA with 64-bit addresses, the driver first tries to set a 64-bit DMA mask. This fails on your system because it is not capable of handling 64-bit addresses, so the driver falls back to 32-bit addresses, which then succeeds.

However, what you are experiencing is that the driver executed a command and the firmware supposedly did not respond. Most likely the firmware did respond, but the driver could not see the response because of problems related to DMA addresses in your system.

Long story short, there is a problem in your system. To investigate this further you might need heavy tools such as a PCIe analyzer.

-----Original Message-----
From: linux-rdma-owner@xxxxxxxxxxxxxxx [mailto:linux-rdma-owner@xxxxxxxxxxxxxxx] On Behalf Of Joao Pinto
Sent: Tuesday, May 9, 2017 12:13 PM
To: linux-rdma@xxxxxxxxxxxxxxx
Subject: mlx5 endpoint driver problem

Hello,

I am running tests with a Mellanox MLX5 Endpoint, and I am getting kernel hangs when trying to enable the HCA:

mlx5_core 0000:01:00.0: enabling device (0000 -> 0002)
mlx5_core 0000:01:00.0: Warning: couldn't set 64-bit PCI DMA mask
mlx5_core 0000:01:00.0: Warning: couldn't set 64-bit consistent PCI DMA mask
mlx5_core 0000:01:00.0: firmware version: 16.19.21102
INFO: task swapper:1 blocked for more than 10 seconds.
Not tainted 4.11.0-BETAMSIX1 #51
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
swapper D 0 1 0 0x00000000

Stack Trace:
  __switch_to+0x0/0x94
  __schedule+0x1da/0x8b0
  schedule+0x26/0x6c
  schedule_timeout+0x2da/0x380
  wait_for_completion+0x92/0x104
  mlx5_cmd_exec+0x70e/0xd60
  mlx5_load_one+0x1b4/0xad8
  init_one+0x404/0x600
  pci_device_probe+0x122/0x1f0
  really_probe+0x1ac/0x348
  __driver_attach+0xa8/0xd0
  bus_for_each_dev+0x3c/0x74
  bus_add_driver+0xc2/0x184
  driver_register+0x50/0xec
  init+0x40/0x60
(...)
Stack Trace:
  __switch_to+0x0/0x94
  __schedule+0x1da/0x8b0
  schedule+0x26/0x6c
  schedule_timeout+0x2da/0x380
  wait_for_completion+0x92/0x104
  mlx5_cmd_exec+0x70e/0xd60
  mlx5_load_one+0x1b4/0xad8
  init_one+0x404/0x600
  pci_device_probe+0x122/0x1f0
  really_probe+0x1ac/0x348
  __driver_attach+0xa8/0xd0
  bus_for_each_dev+0x3c/0x74
  bus_add_driver+0xc2/0x184
  driver_register+0x50/0xec
  init+0x40/0x60
mlx5_core 0000:01:00.0: wait_func:882:(pid 1): ENABLE_HCA(0x104) timeout. Will cause a leak of a command resource
mlx5_core 0000:01:00.0: enable hca failed
mlx5_core 0000:01:00.0: mlx5_load_one failed with error code -110
mlx5_core: probe of 0000:01:00.0 failed with error -110

Could you give me a clue about what might be happening?

Thanks,
Joao

--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
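
For readers following the thread, the 64-bit to 32-bit fallback described in the reply can be modelled in a few lines of user-space C. This is only a sketch of the idea, not the mlx5 driver code: `struct fake_platform`, `fake_set_dma_mask()` and `pick_dma_bits()` are hypothetical names invented for illustration, and `dma_bit_mask()` merely mimics what the kernel's `DMA_BIT_MASK()` macro computes.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical model of a platform: the widest DMA address it can handle. */
struct fake_platform {
    unsigned int max_dma_bits;
};

/* A mask with the low n bits set, in the spirit of the kernel's DMA_BIT_MASK(n). */
uint64_t dma_bit_mask(unsigned int n)
{
    return n >= 64 ? UINT64_MAX : ((UINT64_C(1) << n) - 1);
}

/*
 * Hypothetical stand-in for a "set DMA mask" call: it fails when the
 * requested mask covers address bits the platform cannot handle.
 */
int fake_set_dma_mask(const struct fake_platform *p, uint64_t mask)
{
    return (mask & ~dma_bit_mask(p->max_dma_bits)) ? -EIO : 0;
}

/*
 * Try 64-bit addressing first, then fall back to 32-bit.
 * Returns the address width that was accepted, or -EIO if neither was.
 */
int pick_dma_bits(const struct fake_platform *p)
{
    if (fake_set_dma_mask(p, dma_bit_mask(64)) == 0)
        return 64;
    /* This is the point where the "couldn't set 64-bit" warning would appear. */
    if (fake_set_dma_mask(p, dma_bit_mask(32)) == 0)
        return 32;
    return -EIO;
}
```

With `max_dma_bits` set to 32, `pick_dma_bits()` returns 32, which matches the two warnings in the log followed by the probe continuing. The fallback itself succeeding says nothing about whether the command completion written by the firmware is visible to the driver, which is the problem the reply points at.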