On Thu, 2018-05-17 at 16:16 -0400, Doug Ledford wrote: > On Fri, 2018-05-04 at 13:49 +0200, Evgenii Smirnov wrote: > > Currently, the validity of a path is checked only on > > unicast ARP transmission. If Subnet Manager switchover happens and > > some LIDs get reassigned, driver in a network that uses only IPv6 > > addresses will not try to renew the path records, despite them > > being marked as invalid. > > > > In connected mode, remote side LID change will cause send to fail, > > freeing the corresponding neigh struct. Subsequent packets to this > > destination will trigger allocation of a new neigh struct. > > > > With this patch allocation of new neigh struct will also check the > > validity of the associated path and renew it if necessary. > > > > This, however, will not help in datagram mode, if the host > > continuously sends data to the destination with invalid path. > > The neigh struct alive timer will be updated, thus preventing > > it from reallocation. > > > > Test setup consists of two target hosts and two initiator hosts, > > one of the initiators is with the patch. All hosts have only IPv6 > > addresses from the same subnet and initiators constantly ping targets. > > In connected mode swapping the LIDs of target hosts and switching over SM > > leads to the loss of connectivity for the initiator without the patch. > > Initiator with the patch recovers in ~3 sec. In datagram mode initiator > > with the patch is able to recover only if ping is stopped for > > neigh_obsolete time. > > > > Signed-off-by: Evgenii Smirnov <evgenii.smirnov@xxxxxxxxxxxxxxxx> > > --- > > drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > index 161ba8c76285..db5762d62aea 100644 > > --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c > > +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > @@ -963,7 +963,7 @@ static struct ipoib_neigh *neigh_add_path(struct sk_buff *skb, u8 *daddr, > > > > list_add_tail(&neigh->list, &path->neigh_list); > > > > - if (path->ah) { > > + if (path->ah && path->valid) { > > kref_get(&path->ah->ref); > > neigh->ah = path->ah; > > > > OK, looking at this closer, how is this patch even possibly right? > > From what I can see, when we get a REREG or SM_CHANGE even we do a light > flush. With a light flush we do this: > > FLUSH_LIGHT-> > ipoib_mark_paths_invalid > ipoib_mcast_dev_flush > ipoib_flush_ah > > In those three things, we will mark all paths as invalid (but this does > nothing to any ah's that are attached to neigh structs directly), we > flush all the multicast stuff and rejoin, and then the flush_ah only > collects ahs that are already expired, it doesn't mark any to be expired > or mark any as invalid. > > Then, your patch changes neigh_add_path to check path->valid and only > use an existing path if it's marked valid. OK, but that does nothing > for the case where we already have a neighbor and the path is already > attached to the neighbor. In that case, we fall through the switch, > skip neigh_add_path entirely, and move on to the send. It seems to me > that your fix only resolves the one situation where we don't have a > neighbor but we have a path. It doesn't resolve the case where we have > a neighbor and a path that's been marked invalid, or a neighbor with a > direct ah attached to it but no path. > > It would seem to me that the whole mark_paths_invalid is a mostly > useless function. If we need to redo the paths, then we could just as > easily call flush_paths instead, which will get rid of all the paths, > attached to a neighbor or not, and then we will end up looking them all > up again on an as needed basis. That would fix both the neighbor > present with path and neighbor absent situations. Then we would just > need to resolve the neighbor present with ah situation, and for that it > seems we need to modify flush_ah to actually flush the ahs instead of > just collecting the already expired ones. > > Is there something I'm missing here (I admit it's been a while since I > was mucking around in the ipoib transmit code)? > > After spending more time staring at this, I think I see what's going on finally. In a nutshell, all neighbors have paths, all paths have an AH, and once we've established that a path is valid, we shortcut the path out of the picture and put the ah directly on the neigh via the neigh->ah pointer. The problem this creates is that when we want to mark paths invalid, we can't easily deal with the fact that the neigh is holding a direct pointer to the path's ah. So, since the validity is as much on the ah as it is on the path, move path->valid to path->ah->valid so that any time we look up a neigh struct, and neigh->ah has been set, we can check neigh->ah->valid to make sure we should be using it. Then, also, since neigh->ah is used without taking the priv->lock that we use when setting path->ah and its elements, we can't just set neigh- >ah to NULL or we can cause oopses. That means when we have the situation where neigh->ah->valid == 0, we need to refresh our path (which on path rec completion will do a swap of the old ah for a new ah), so add code to refresh the path in this situation. I'm sending a patch under separate email that I think will resolve the situation for all of the needed cases. Can you please test it and let me know if it resolves your issue (and the mentioned UD issue too if you don't mind)? -- Doug Ledford <dledford@xxxxxxxxxx> GPG KeyID: B826A3330E572FDD Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
Attachment:
signature.asc
Description: This is a digitally signed message part