On Mon, Jun 26, 2017 at 01:24:11PM -0400, Chuck Lever wrote:
> Running various I/O stress workloads with iozone on an
> NFSv3 mount using RDMA on RoCEv1 (FRWR).
>
> Jun 26 12:50:21 morisot kernel: mlx4_core 0000:01:00.0: command 0x49 timed out (go bit not cleared)

It means that the device had an internal error earlier and/or the PCI
channel is offline, and the device is restarting now.

> Jun 26 12:50:21 morisot kernel: mlx4_core 0000:01:00.0: device is going to be reset
> Jun 26 12:50:22 morisot kernel: mlx4_core 0000:01:00.0: device was reset successfully
> Jun 26 12:50:22 morisot kernel: mlx4_en 0000:01:00.0: Internal error detected, restarting device
> Jun 26 12:50:22 morisot kernel: <mlx4_ib> mlx4_ib_handle_catas_error: mlx4_ib_handle_catas_error was started
> Jun 26 12:50:22 morisot kernel: <mlx4_ib> mlx4_ib_handle_catas_error: mlx4_ib_handle_catas_error ended
> Jun 26 12:50:22 morisot kernel: ib_srpt received unrecognized IB event 8

Thanks
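
To check how often this particular timeout shows up in a saved kernel log, something like the following could be used. It is only a minimal sketch: the regular expression is derived from the single mlx4_core line quoted above, and the names CMD_TIMEOUT_RE and find_cmd_timeouts are made up for illustration, not part of any driver or tool.

```python
import re

# Pattern built from the one mlx4_core warning quoted in this thread.
# Assumption: other kernels/versions may word the message differently.
CMD_TIMEOUT_RE = re.compile(
    r"mlx4_core (?P<pci>[0-9a-fA-F:.]+): "
    r"command (?P<op>0x[0-9a-fA-F]+) timed out \(go bit not cleared\)"
)


def find_cmd_timeouts(lines):
    """Return a (pci_address, opcode) tuple for every matching log line."""
    hits = []
    for line in lines:
        match = CMD_TIMEOUT_RE.search(line)
        if match:
            hits.append((match.group("pci"), int(match.group("op"), 16)))
    return hits


if __name__ == "__main__":
    sample = [
        "Jun 26 12:50:21 morisot kernel: mlx4_core 0000:01:00.0: "
        "command 0x49 timed out (go bit not cleared)",
        "Jun 26 12:50:21 morisot kernel: mlx4_core 0000:01:00.0: "
        "device is going to be reset",
    ]
    for pci, opcode in find_cmd_timeouts(sample):
        print(f"{pci}: command {opcode:#x} timed out")
```

This only reports where the timeout was logged; it says nothing about the underlying cause, which still has to be read from the messages around it.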