On Thu, Jan 25, 2018 at 05:12:22PM +0200, Leon Romanovsky wrote: > From: Leon Romanovsky <leonro@xxxxxxxxxxxx> > + > +/** > + * struct rdma_restrack_entry - metadata per-entry > + */ > +struct rdma_restrack_entry { > + /** > + * @valid: validity indicator > + * > + * The entries are filled during rdma_restrack_add, > + * can be attempted to be free during rdma_restrack_del. > + * > + * As an example for that, see mlx5 QPs with type MLX5_IB_QPT_HW_GSI > + */ > + bool valid; > + /* > + * @kref: Protect destroy of the resource > + */ > + struct kref kref; Sticking a kref in a random structure that is itself not malloc'd is a big red flag: @@ -1746,6 +1756,11 @@ struct ib_qp { + struct rdma_restrack_root res; Not seeing how this works at all. qp does this: + rdma_restrack_del(&qp->res); ret = qp->device->destroy_qp(qp); And for instance the mlx5 implementation of destroy_qp() does: int mlx5_ib_destroy_qp(struct ib_qp *qp) { struct mlx5_ib_qp *mqp = to_mqp(qp); [..] kfree(mqp); So we kfree the kref containing memory outside the kref's release function. Super big red flag. rdma_restrack_del is just: + return kref_put(&res->kref, restrack_release); So we don't do anything to block progress to the kfree unless the kref drefs to 0. But the reader holds a get/put across its critical section so we never hit the 0 case when there is another user. Not even remotely right. Adding a kref means it has to be added to struct ib_qp and then kfree's in the drivers need to change to kref_put. That is alot of work. The best idea I've had to make this locking work is what I emailed you a few days ago.. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html