Re: [PATCH] IB/ipoib: check path validity on allocation of neigh struct

Yes, in this case we get a CLIENT_REREGISTER event and do a light flush, which marks all path records as invalid and reconnects all multicast groups. I think a normal flush would be too much, as all interfaces in the network would go down on a simple subnet manager change.
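
To make the effect of the light flush concrete, here is a minimal userspace sketch, not the driver code. The names flush_model_path and flush_model_light_flush are made up for illustration; the only behaviour taken from the discussion is that path records get marked invalid and nothing else is torn down.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cached path record. */
struct flush_model_path {
	bool valid; /* cleared by a flush, set again once the record is renewed */
	int dlid;   /* destination LID cached from the last path lookup */
};

/*
 * Model of a light flush: every cached path is only marked invalid.
 * The cached LID is left untouched, so it stays in use until some
 * code path looks at 'valid' and asks for a fresh record.
 */
static void flush_model_light_flush(struct flush_model_path *paths, size_t n)
{
	for (size_t i = 0; i < n; i++)
		paths[i].valid = false;
}
```

The point of the sketch is that marking a path invalid changes nothing by itself; something on the transmit side has to check the flag.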

A light flush seems suitable for this case. The problem is that the validity of path records is checked only on outgoing unicast ARPs. My impression is that the correct solution, one that also covers the UD case, is to check validity on outgoing ICMPv6 Neighbor Advertisement packets as well. I can try to prepare a patch that implements that.

This patch, however, still looks useful to me: checking the path on neigh creation allows such situations to be detected faster in connected mode, without waiting for the ARP cache expiry time.
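
As a rough illustration of the check on neigh creation, here is a simplified userspace model. It is a sketch only: the sketch_* types and the function are hypothetical, and the real neigh_add_path() does more (locking, skb queueing, starting the path query). Only the condition "path->ah && path->valid" is taken from the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the address handle and the path record. */
struct sketch_ah {
	int refcount;
};

struct sketch_path {
	struct sketch_ah *ah; /* NULL until a path record has been resolved */
	bool valid;           /* cleared when a flush marks the path invalid */
};

enum sketch_action {
	SKETCH_REUSE_AH,   /* new neigh takes a reference on the cached handle */
	SKETCH_RENEW_PATH, /* new neigh gets no handle; the path must be renewed */
};

/*
 * Decide what a freshly allocated neigh does with the path it is
 * attached to. Before the patch only 'path->ah' was tested, so a
 * stale handle was reused even after the path was marked invalid.
 */
static enum sketch_action sketch_neigh_add_path(struct sketch_path *path,
						struct sketch_ah **neigh_ah)
{
	if (path->ah && path->valid) {
		path->ah->refcount++;
		*neigh_ah = path->ah;
		return SKETCH_REUSE_AH;
	}

	*neigh_ah = NULL;
	return SKETCH_RENEW_PATH;
}
```

With a path that has a handle but valid == false, the sketch returns SKETCH_RENEW_PATH, which corresponds to the connected mode recovery described above: the failed send frees the neigh, and the next packet allocates a new one that no longer trusts the stale path.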


On 09.05.2018 16:31, Doug Ledford wrote:
On Fri, 2018-05-04 at 13:49 +0200, Evgenii Smirnov wrote:
Currently, the validity of a path is checked only on
unicast ARP transmission. If a Subnet Manager switchover happens and
some LIDs get reassigned, the driver in a network that uses only IPv6
addresses will not try to renew the path records, despite them
being marked as invalid.
Are we not getting a flush event in this case?  If we are getting a
flush event, maybe we just aren't doing a heavy enough flush?

In general I don't have a problem with this patch, but I would prefer to
find a solution that resolves the UD case too, and maybe that just needs
to flush harder on the specific event we get when we get a new SM (it's
a rereg event, yes?).

In connected mode, a remote-side LID change will cause the send to fail,
freeing the corresponding neigh struct. Subsequent packets to this
destination will trigger allocation of a new neigh struct.

With this patch, allocation of a new neigh struct will also check the
validity of the associated path and renew it if necessary.

This, however, will not help in datagram mode if the host
continuously sends data to a destination with an invalid path.
The neigh struct alive timer will keep being updated, thus preventing
the struct from being reallocated.

The test setup consists of two target hosts and two initiator hosts;
one of the initiators has the patch applied. All hosts have only IPv6
addresses from the same subnet, and the initiators constantly ping the
targets. In connected mode, swapping the LIDs of the target hosts and
switching over the SM leads to loss of connectivity for the initiator
without the patch. The initiator with the patch recovers in ~3 sec. In
datagram mode, the initiator with the patch is able to recover only if
ping is stopped for the neigh_obsolete time.

Signed-off-by: Evgenii Smirnov <evgenii.smirnov@xxxxxxxxxxxxxxxx>
---
  drivers/infiniband/ulp/ipoib/ipoib_main.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/infiniband/ulp/ipoib/ipoib_main.c b/drivers/infiniband/ulp/ipoib/ipoib_main.c
index 161ba8c76285..db5762d62aea 100644
--- a/drivers/infiniband/ulp/ipoib/ipoib_main.c
+++ b/drivers/infiniband/ulp/ipoib/ipoib_main.c
@@ -963,7 +963,7 @@ static struct ipoib_neigh *neigh_add_path(struct sk_buff *skb, u8 *daddr,
  	list_add_tail(&neigh->list, &path->neigh_list);
-	if (path->ah) {
+	if (path->ah && path->valid) {
  		kref_get(&path->ah->ref);
  		neigh->ah = path->ah;



