【A question about kernel verbs API】

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi, Jason Gunthorpe & Leon Romanovsky& Parav Pandit
   Recently when I was learning kernel ofed code, I found an interesting thing about verbs, the implementation rely on
roce driver,  taking ib_dereg_mr for example.

When the driver returns error, the reference count of rdma resource(pd, mr, etc.) won't be decreased. I worried that it
will cause a memory leak.

int ib_dereg_mr(struct ib_mr *mr)
{
	struct ib_pd *pd = mr->pd;
	struct ib_dm *dm = mr->dm;
	int ret;

	rdma_restrack_del(&mr->res);
	ret = mr->device->ops.dereg_mr(mr);
	if (!ret) {
		atomic_dec(&pd->usecnt);
		if (dm)
			atomic_dec(&dm->usecnt);
	}

	return ret;
}

and it is even worse in ib_destroy_qp, rdma_put_gid_attr is not called, so that the net_device will be hold all the time, and caused the
pcie remove scenario hanging.

The following code is:
int ib_destroy_qp(struct ib_qp *qp)
{
	const struct ib_gid_attr *alt_path_sgid_attr = qp->alt_path_sgid_attr;
	const struct ib_gid_attr *av_sgid_attr = qp->av_sgid_attr;
	struct ib_pd *pd;
	struct ib_cq *scq, *rcq;
	struct ib_srq *srq;
	struct ib_rwq_ind_table *ind_tbl;
	struct ib_qp_security *sec;
	int ret;

	WARN_ON_ONCE(qp->mrs_used > 0);

	if (atomic_read(&qp->usecnt))
		return -EBUSY;

	if (qp->real_qp != qp)
		return __ib_destroy_shared_qp(qp);

	pd   = qp->pd;
	scq  = qp->send_cq;
	rcq  = qp->recv_cq;
	srq  = qp->srq;
	ind_tbl = qp->rwq_ind_tbl;
	sec  = qp->qp_sec;
	if (sec)
		ib_destroy_qp_security_begin(sec);

	if (!qp->uobject)
		rdma_rw_cleanup_mrs(qp);

	rdma_restrack_del(&qp->res);
	ret = qp->device->ops.destroy_qp(qp);
	if (!ret) {
		if (alt_path_sgid_attr)
			rdma_put_gid_attr(alt_path_sgid_attr);
		if (av_sgid_attr)
			rdma_put_gid_attr(av_sgid_attr);
		if (pd)
			atomic_dec(&pd->usecnt);
		if (scq)
			atomic_dec(&scq->usecnt);
		if (rcq)
			atomic_dec(&rcq->usecnt);
		if (srq)
			atomic_dec(&srq->usecnt);
		if (ind_tbl)
			atomic_dec(&ind_tbl->usecnt);
		if (sec)
			ib_destroy_qp_security_end(sec);
	} else {
		if (sec)
			ib_destroy_qp_security_abort(sec);
	}

	return ret;
}


My question is what the condieration about the  resource destroy mechanism when roce driver returns error?
Or I understand it wrong?

Thanks
Lijun Ou




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux