Re: Issue with IB/ipoib: Remove device when one port fails to init

On Tue, Nov 28, 2017 at 02:00:12PM -0700, Jason Gunthorpe wrote:
> On Tue, Nov 28, 2017 at 09:03:46PM +0200, Yuval Shaia wrote:
>
> > I agree that patch as it is now does not really handle the case where one
> > port fails so it needs to be fixed.
> >
> > The thing is that from your perspective the idea itself is wrong, i.e. if
> > one (of for example two ports) fails the driver needs to continue and serve
> > the other port and just print error message.
>
> On this point, I think if ports are completely independent at the ipoib
> layer then they should not become linked during the add process.
>
> ie if a port is working and a second port fails then it should not
> kill the first port.
>
> However, it is unfortunate we have no recovery from this case at all.
>
> Alex V: However, why is the current behavior a problem? Is this
> because of a dual port card with IB and ROCE concurrently? And the
> add 'fails' the ROCE port even though it isn't even really a failure?
> We certainly shouldn't print in that case..

It is a problem for single-port cards too; I see this print on my system:
root@mtr-leonro:~# dmesg |grep Fail
[    7.785329] Failed to init port, removing it
root@mtr-leonro:~# /mnt/iproute2/rdma/rdma link
1/1: mlx5_0/1: subnet_prefix fe80:0000:0000:0000 lid 13399 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
2/1: mlx5_1/1: subnet_prefix fe80:0000:0000:0000 lid 13400 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
3/1: mlx5_2/1: subnet_prefix fe80:0000:0000:0000 lid 13401 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
4/1: mlx5_3/1: state DOWN physical_state DISABLED
5/1: mlx5_4/1: subnet_prefix fe80:0000:0000:0000 lid 13403 sm_lid 49151 lmc 0 state ACTIVE physical_state LINK_UP
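
For reference, the behavior suggested in the quoted text (a failing port
should not take down a port that already works) could look roughly like the
userspace sketch below. This is not the ipoib code: struct port_state,
init_port(), add_ports() and the fail_mask parameter are hypothetical names
made up for illustration, and the failure is simulated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical per-port state; not the real ipoib structures. */
struct port_state {
	int  port_num;
	bool active;
};

/*
 * Hypothetical stand-in for per-port init. A port fails when its bit
 * (1 << port_num) is set in fail_mask, which simulates an init error.
 */
static int init_port(struct port_state *p, int port_num, unsigned int fail_mask)
{
	p->port_num = port_num;
	if (fail_mask & (1u << port_num)) {
		p->active = false;
		return -1;
	}
	p->active = true;
	return 0;
}

/*
 * Bring up each port independently (ports are numbered from 1).
 * A failing port is reported and skipped; ports that came up earlier
 * are left alone. Returns the number of working ports.
 */
static int add_ports(struct port_state *ports, int nports, unsigned int fail_mask)
{
	int up = 0;

	for (int i = 0; i < nports; i++) {
		if (init_port(&ports[i], i + 1, fail_mask)) {
			fprintf(stderr, "port %d: init failed, skipping\n", i + 1);
			continue;
		}
		up++;
	}
	return up;
}
```

With two ports and port 2 failing, add_ports() returns 1 and port 1 stays
active. Whether the error message should be printed at all for a port that
was never expected to come up is a separate question.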

Thanks

>
> Jason
