On Mon, 5 Jul 2004 09:34:43 +0200
Harald Welte <laforge@gnumonks.org> wrote:

> 1) arp_bind_neighbour fails for some reason
> 2) the 'emergency' rt_garbage_collect with zero min_interval and
>    elasticity 1 takes place.  I guess the assumption is that cleaning
>    the routing cache will drop references to the neighbour table and
>    thus clean it somehow.
> 3) still the next arp_bind_neighbour fails.
>
> So the question is, why does it fail?  It calls __neigh_lookup_errno()
> and further on either neigh_lookup() or neigh_create() fail.  Especially
> in neigh_create there are several error cases (n->parms->neigh_setup()
> or tbl->constructor() failing) that might make it return an error.
>
> So the blind assumption (and error message) that the neighbour table
> might be full seems a bit broad.

That's true.  But it is unlikely that anything other than a
SLAB_ATOMIC failure is occurring, at least on ethernet.  This is
because:

1) The arp_tbl constructor method only fails if in_dev_get(dev)
   fails.  If we have a route via that device for which we're
   arping, in_dev_get() should always succeed.

2) There is no n->parms->neigh_setup method used for ethernet
   devices, so that failure cannot happen either.

3) You said that you set tbl->gc_thresh{2,3} to extremely large
   values, so those tests at the top of neigh_alloc() should not
   fail either.

The only thing left is the SLAB_ATOMIC allocation.

But if I were you I'd double check that gc_thresh{2,3} stuff.
Maybe the sysctl you're setting is not truly propagating.

> Expect more news later this week.

Any progress? :-)
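
To make the gc_thresh{2,3} point concrete, here is a rough user-space
sketch of the kind of threshold test that sits at the top of
neigh_alloc().  It is not the kernel source: struct tbl_model,
MODEL_HZ and refused_by_thresholds() are names made up for this
illustration, the forced GC is reduced to a boolean, and the exact
comparisons may differ between kernel versions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the few neigh_table fields that matter here. */
struct tbl_model {
    int entries;               /* current number of neighbour entries */
    int gc_thresh2;            /* soft limit */
    int gc_thresh3;            /* hard limit */
    unsigned long last_flush;  /* tick of the last forced GC */
};

#define MODEL_HZ 1000UL  /* assumed tick rate, for illustration only */

/*
 * Returns true if the allocation would be refused by the threshold
 * tests, i.e. before the SLAB_ATOMIC allocation is even attempted.
 * gc_freed models whether a forced GC run managed to free anything.
 */
static bool refused_by_thresholds(const struct tbl_model *tbl,
                                  unsigned long now, bool gc_freed)
{
    assert(tbl != NULL);

    bool over_hard = tbl->entries > tbl->gc_thresh3;
    bool over_soft = tbl->entries > tbl->gc_thresh2 &&
                     now - tbl->last_flush > 5 * MODEL_HZ;

    if (over_hard || over_soft) {
        /* A forced GC runs here; refuse only if it freed nothing
         * and the table is still over the hard limit. */
        if (!gc_freed && over_hard)
            return true;
    }
    return false;
}
```

In this model, a table with 5000 entries and both thresholds at
1000000 is never refused, so any remaining failure has to come from
the allocation itself.  With gc_thresh3 at, say, 1024 the same table
is refused whenever the forced GC frees nothing, which is what you
would see if the sysctl values never reached the table.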