On Fri, 2018-05-11 at 16:12 +0200, Evgenii Smirnov wrote: > Yes, in this case we are getting CLIENT_REREGISTER event and will do > light flush, which marks all path records as invalid and reconnects all > multicast groups. I think it would be too much to do a normal flush, as > all interfaces in the network will go down on a simple subnet manager > change. > > Light flush seems to be suitable for this case, the problem is that > validity of path records is checked only on outgoing unicast ARPs. My > impression is that correct solution that also covers UD case is to check > validity also on outgoing ICMP6 Neighbor Advertisement packets. I can > try to prepare a patch that implements that. > > This patch, however, still looks useful to me, as checking path on neigh > creation allows to detect such situations in connected mode faster, not > waiting for ARP cache expiry time. I don't want us waiting on any expiry time. That's just the wrong answer here. If we know that the SM changed, and that an SM change can reder our paths invalid by reassigning LIDs, then we need to be proactive in resolving that our paths are valid. Waiting for an expiry timer is just the wrong thing to do IMO. > > On 09.05.2018 16:31, Doug Ledford wrote: > > On Fri, 2018-05-04 at 13:49 +0200, Evgenii Smirnov wrote: > > > Currently, the validity of a path is checked only on > > > unicast ARP transmission. If Subnet Manager switchover happens and > > > some LIDs get reassigned, driver in a network that uses only IPv6 > > > addresses will not try to renew the path records, despite them > > > being marked as invalid. > > > > Are we not getting a flush event in this case? If we are getting a > > flush event, maybe we just aren't doing a heavy enough flush? > > > > In general I don't have a problem for this patch, but I would prefer to > > find a solution that resolves the UD case too, and maybe that just needs > > to flush harder on the specific event we get when we get a new SM (it's > > a rereg event, yes?). > > > > > In connected mode, remote side LID change will cause send to fail, > > > freeing the corresponding neigh struct. Subsequent packets to this > > > destination will trigger allocation of a new neigh struct. > > > > > > With this patch allocation of new neigh struct will also check the > > > validity of the associated path and renew it if necessary. > > > > > > This, however, will not help in datagram mode, if the host > > > continuously sends data to the destination with invalid path. > > > The neigh struct alive timer will be updated, thus preventing > > > it from reallocation. > > > > > > Test setup consists of two target hosts and two initiator hosts, > > > one of the initiators is with the patch. All hosts have only IPv6 > > > addresses from the same subnet and initiators constantly ping targets. > > > In connected mode swapping the LIDs of target hosts and switching over SM > > > leads to the loss of connectivity for the initiator without the patch. > > > Initiator with the patch recovers in ~3 sec. In datagram mode initiator > > > with the patch is able to recover only if ping is stopped for > > > neigh_obsolete time. > > > > > > Signed-off-by: Evgenii Smirnov<evgenii.smirnov@xxxxxxxxxxxxxxxx> > > > --- > > > drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > > index 161ba8c76285..db5762d62aea 100644 > > > --- a/drivers/infiniband/ulp/ipoib/ipoib_main.c > > > +++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c > > > @@ -963,7 +963,7 @@ static struct ipoib_neigh *neigh_add_path(struct sk_buff *skb, u8 *daddr, > > > > > > list_add_tail(&neigh->list, &path->neigh_list); > > > > > > - if (path->ah) { > > > + if (path->ah && path->valid) { > > > kref_get(&path->ah->ref); > > > neigh->ah = path->ah; > > > > > -- Doug Ledford <dledford@xxxxxxxxxx> GPG KeyID: B826A3330E572FDD Key fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
Attachment:
signature.asc
Description: This is a digitally signed message part