On 7/8/19 11:47 AM, Dag Moxnes wrote:
> Thanks Jason,
>
> Regards,
> Dag
>
> On 08.07.2019 19:50, Jason Gunthorpe wrote:
>> On Mon, Jul 08, 2019 at 01:16:24PM +0200, Dag Moxnes wrote:
>>> Use neighbour lock when copying MAC address from neighbour data struct
>>> in dst_fetch_ha.
>>>
>>> When not using the lock, it is possible for the function to race with
>>> neigh_update, causing it to copy an invalid MAC address.
>>>
>>> It is possible to provoke this error by calling rdma_resolve_addr in a
>>> tight loop, while deleting the corresponding ARP entry in another tight
>>> loop.
>>>
>>> This will cause the race shown in the following sample trace:
>>>
>>> rdma_resolve_addr()
>>>   rdma_resolve_ip()
>>>     addr_resolve()
>>>       addr_resolve_neigh()
>>>         fetch_ha()
>>>           dst_fetch_ha()
>>>             n->nud_state == NUD_VALID
>> It isn't nud_state that is the problem here, it is the parallel
>> memcpy's onto ha. I fixed the commit message
>>
>> This could also have been solved by using the ha_lock, but I don't
>> think we have a reason to particularly over-optimize this.

Sorry I'm late to the party, but why not just use neigh_ha_snapshot()?

>>
>>> drivers/infiniband/core/addr.c | 9 ++++++---
>>> 1 file changed, 6 insertions(+), 3 deletions(-)
>> Applied to for-next, thanks
>>
>> Jason
>

Mark
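
For context on the neigh_ha_snapshot() suggestion: that helper copies n->ha
under the ha_lock seqlock and retries if a writer got in between. Below is a
rough user-space sketch of that read/retry pattern using C11 atomics. It is
not the kernel code; struct demo_neigh, demo_ha_update() and
demo_ha_snapshot() are hypothetical names made up for illustration, HA_LEN is
assumed to be a 6-byte Ethernet address, and the writer side assumes writers
are serialized by some other lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define HA_LEN 6 /* assumption: Ethernet-sized hardware address */

/* Hypothetical stand-in for the parts of struct neighbour used here. */
struct demo_neigh {
	atomic_uint seq;                  /* even: stable, odd: write in progress */
	_Atomic unsigned char ha[HA_LEN]; /* hardware address being protected */
};

/* Writer side. Assumes writers are already serialized against each other. */
static void demo_ha_update(struct demo_neigh *n, const unsigned char *ha)
{
	unsigned int seq0 = atomic_load_explicit(&n->seq, memory_order_relaxed);

	assert(n != NULL && ha != NULL);

	/* Mark the write as in progress (sequence becomes odd). */
	atomic_store_explicit(&n->seq, seq0 + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	for (size_t i = 0; i < HA_LEN; i++)
		atomic_store_explicit(&n->ha[i], ha[i], memory_order_relaxed);

	/* Publish the new address (sequence becomes even again). */
	atomic_store_explicit(&n->seq, seq0 + 2, memory_order_release);
}

/* Reader side: copy the address, retry if a writer raced with the copy. */
static void demo_ha_snapshot(unsigned char *dst, struct demo_neigh *n)
{
	unsigned int begin, end;

	assert(dst != NULL && n != NULL);

	do {
		begin = atomic_load_explicit(&n->seq, memory_order_acquire);
		for (size_t i = 0; i < HA_LEN; i++)
			dst[i] = atomic_load_explicit(&n->ha[i],
						      memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&n->seq, memory_order_relaxed);
	} while ((begin & 1u) || begin != end);
}
```

The point of the pattern is that readers never block the writer and never
return a half-updated address; they just redo the copy when the sequence
number changed or was odd at the start.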