On Tue, Jun 7, 2016 at 2:59 AM, Doug Ledford <dledford@xxxxxxxxxx> wrote: > On 6/5/2016 5:10 PM, Or Gerlitz wrote: > >>> - neigh->alive = jiffies; >>> + >>> + if (likely(skb_queue_len(&neigh->queue) < IPOIB_MAX_PATH_REC_QUEUE)) >>> + neigh->alive = jiffies; >> >> why testing the queue len makes things right? > > I'm with Or on this one. This test doesn't make sense unless you assume > that the neighbor is being hit with a steady stream of packets. If > someone just runs ping -c 1 to a dead neighbor, this test will fail. > The idea of the patch is OK, but the test needs to be reworked for > situations other than a blasting stream of data. > I will try to explain the idea behind that test, and why it works (I hope I'll make it clear this time :)) there are 2 objects that are taking place here, the garbage-collector that removes neigh if was not used for defined time, and the path-query for each neigh. if the path-query failed the packets for specific neigh are kept in the neigh queue, but only if there are no more than IPOIB_MAX_PATH_REC_QUEUE, and the neigh is still kept for ever (let's assume CM connected that was stopped without any notification with DREQ etc.) The only way to know that there is an unresolved neigh is by checking its queue, if it is full we can assume that this is an unresolved neigh otherwise it is resolved. So, the test doesn't take any assumption on the shape of the traffic, even in a ping requests once a second it will failed after 3 times the request was sent, and the neigh time will not be updated which will leave it to the garbage-collector to delete it, that gives the driver the possibility to recreate the neigh and to update its ah/pr, otherwise the driver will try over and over to resend to that unresolved neigh. > -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html