On Wed 14 Aug 2019 at 13:53, Dan Carpenter <dan.carpenter@xxxxxxxxxx> wrote: > [ I already wrote this email, but it looks like I deleted it instead of > sending it. So weird. I hopefully don't send it twice! ] > > Hi Vlad, > > I noticed a possible refcounting bug in commit 948993f2beeb ("net/mlx5e: > Extend encap entry with reference counter") from Jun 3, 2018. > > drivers/net/ethernet/mellanox/mlx5/core/en_tc.c:1435 mlx5e_tc_update_neigh_used_value() > error: dereferencing freed memory 'e' Hi Dan, Thanks for reporting! > > drivers/net/ethernet/mellanox/mlx5/core/en_tc.c > > 1415 void mlx5e_tc_update_neigh_used_value(struct mlx5e_neigh_hash_entry *nhe) > 1416 { > 1417 struct mlx5e_neigh *m_neigh = &nhe->m_neigh; > 1418 struct mlx5e_tc_flow *flow; > 1419 struct mlx5e_encap_entry *e; > 1420 struct mlx5_fc *counter; > 1421 struct neigh_table *tbl; > 1422 bool neigh_used = false; > 1423 struct neighbour *n; > 1424 u64 lastuse; > 1425 > 1426 if (m_neigh->family == AF_INET) > 1427 tbl = &arp_tbl; > 1428 #if IS_ENABLED(CONFIG_IPV6) > 1429 else if (m_neigh->family == AF_INET6) > 1430 tbl = &nd_tbl; > 1431 #endif > 1432 else > 1433 return; > 1434 > 1435 list_for_each_entry_safe(e, tmp, &nhe->encap_list, encap_list) { > 1436 struct encap_flow_item *efi, *tmp; > 1437 > 1438 if (!(e->flags & MLX5_ENCAP_ENTRY_VALID) || > 1439 !mlx5e_encap_take(e)) > ^^^^^^^^^^^^^^^^^^^ > We take a reference here. Okay, we take reference to encap at the beginning of outer loop. > > 1440 continue; > 1441 > 1442 list_for_each_entry_safe(efi, tmp, &e->flows, list) { > 1443 flow = container_of(efi, struct mlx5e_tc_flow, > 1444 encaps[efi->index]); > 1445 if (IS_ERR(mlx5e_flow_get(flow))) > 1446 continue; Now we take reference to flow at the beginning of the inner loop. > 1447 > 1448 if (mlx5e_is_offloaded_flow(flow)) { > 1449 counter = mlx5e_tc_get_counter(flow); > 1450 lastuse = mlx5_fc_query_lastuse(counter); > 1451 if (time_after((unsigned long)lastuse, nhe->reported_lastuse)) { > 1452 mlx5e_flow_put(netdev_priv(e->out_dev), flow); > 1453 neigh_used = true; > 1454 break; > > I think we need to call mlx5e_encap_put(netdev_priv(e->out_dev), e); > before this break; We are breaking from the inner loop (not returning from the function, just breaking the innermost loop) here after releasing the reference to flow which was obtained at the beginning of the loop. > > 1455 } > 1456 } > 1457 > 1458 mlx5e_flow_put(netdev_priv(e->out_dev), flow); > 1459 } > 1460 > 1461 mlx5e_encap_put(netdev_priv(e->out_dev), e); > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ Now we are releasing the reference to encap, both when inner loop completed normally or when it was terminated with 'break'. If we were to release the encap before the break, this would be the second time we 'put' it. What am I missing? > 1462 if (neigh_used) > 1463 break; > 1464 } > 1465 > > regards, > dan carpenter