On Fri, 2018-01-26 at 06:33 -0800, Dennis Dalessandro wrote:
> From: Alex Estrin <alex.estrin@xxxxxxxxx>
>
> On reboot SM can program port pkey table before ipoib registered its
> event handler, which could result in missing pkey event and leave root
> interface with initial pkey value from index 0.
>
> Since OPA port starts with invalid pkey in index 0, root interface will
> fail to initialize and stay down with no-carrier flag.
>
> For IB ipoib interface may end up with pkey different from value
> opensm put in pkey table idx 0, resulting in connectivity issues
> (different mcast groups, for example).
>
> Close the window by calling event handler after registration
> to make sure ipoib pkey is in sync with port pkey table.
>
> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
> Reviewed-by: Ira Weiny <ira.weiny@xxxxxxxxx>
> Signed-off-by: Alex Estrin <alex.estrin@xxxxxxxxx>
> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
> ---
>  drivers/infiniband/ulp/ipoib/ipoib_main.c |    3 +++
>  1 files changed, 3 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> index 5930c7d..161ba8c 100644
> --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
> +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
> @@ -2306,6 +2306,9 @@ void ipoib_set_dev_features(struct ipoib_dev_priv *priv, struct ib_device *hca)
>  			      priv->ca, ipoib_event);
>  	ib_register_event_handler(&priv->event_handler);
>
> +	/* call event handler to ensure pkey in sync */
> +	queue_work(ipoib_workqueue, &priv->flush_heavy);
> +

This seems like a bit of a sledgehammer to the issue.
Looking through ipoib_add_port(), the real race is that we have to call
ib_query_pkey() early in the init sequence, as some of the later steps
need it to be set (ipoib_dev_init() must have it already set, for one).
But since we don't set up our event handler until after we've finished
setting up the device, there is a window from our first ib_query_pkey()
call until we complete the ib_register_event_handler() call in which the
pkey can change. Instead of throwing the flush regardless, it might be
nicer to do:

	{
		u16 new_pkey;

		ib_query_pkey(hca, port, 0, &new_pkey);
		if (priv->pkey != (new_pkey | 0x8000))
			/* The pkey changed between when we
			 * read it and now, flush the device */
			queue_work(ipoib_workqueue, &priv->flush_heavy);
	}

>  	result = register_netdev(priv->dev);
>  	if (result) {
>  		pr_warn("%s: couldn't register ipoib port %d; error %d\n",
>

--
Doug Ledford <dledford@xxxxxxxxxx>
    GPG KeyID: B826A3330E572FDD
    Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD