On Tue, Nov 28, 2017 at 02:00:12PM -0700, Jason Gunthorpe wrote:
> On Tue, Nov 28, 2017 at 09:03:46PM +0200, Yuval Shaia wrote:
>
> > I agree that patch as it is now does not really handle the case where one
> > port fails so it needs to be fixed.
> >
> > The thing is that from your perspective the idea itself is wrong, i.e. if
> > one (of for example two ports) fails the driver needs to continue and serve
> > the other port and just print error message.
>
> On this point, I think if ports are completely independent at the ipoib
> layer then they should not become linked during the add process.
>
> ie if a port is working and a second port fails then it should not
> kill the first port.
>
> However, it is unfortunate we have no recovery from this case at all.
>
> Alex V: However, why is the current behavior a problem? Is this
> because of a dual port card with IB and ROCE concurrently? And the
> add 'fails' the ROCE port even though it isn't even really a failure?
> We certainly shouldn't print in that case..

Per my understanding, no.
Alex is referring to a system where a two-port card is running RoCE on
both ports. Alex, please correct me if I'm wrong.

In its current state, ipoib_add_one does not kill the working port in such
a case; it just prints an error message (not a warning).

Please review the patch "IB/ipoib: Warn when one port fails to initialize",
which fixes this by removing the error message and the call to
ipoib_remove_one, and by adding the missing warning message to
ipoib_add_port.

>
> Jason
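
To make the behavior under discussion concrete, here is a minimal user-space
sketch of "one port fails, the others keep working, and only a warning is
printed". It is not the ipoib code and not the patch itself: add_ports,
port_init_fn and init_fail_second are hypothetical names invented for this
illustration, and the real driver logic in ipoib_add_one and ipoib_add_port
may differ.

```c
#include <stdio.h>

/* Hypothetical stand-in for per-port setup: returns 0 on success, -1 on failure. */
typedef int (*port_init_fn)(int port);

/*
 * Try to bring up each port independently (ports are numbered from 1).
 * A failing port is reported with a warning and skipped; ports that
 * already came up are left untouched.
 * Records per-port status in up[] (1 = up, 0 = failed) and returns the
 * number of ports that initialized successfully.
 */
int add_ports(int nports, port_init_fn init, int up[])
{
    int count = 0;

    for (int p = 1; p <= nports; p++) {
        if (init(p) == 0) {
            up[p - 1] = 1;
            count++;
        } else {
            up[p - 1] = 0;
            fprintf(stderr, "warning: port %d failed to initialize\n", p);
        }
    }
    return count;
}

/* Illustrative init function (assumption): the second port always fails. */
int init_fail_second(int port)
{
    return port == 2 ? -1 : 0;
}
```

With init_fail_second and two ports, the first port stays up and the second
is only warned about. What the caller should do when no port comes up at all
is not settled in this thread, so the sketch simply returns the count.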