Re: [PATCH rdma-next v7 0/8] RDMA resource tracking

On Tue, Jan 30, 2018 at 09:47:48PM +0000, Bart Van Assche wrote:
> On Tue, 2018-01-30 at 14:42 -0700, Jason Gunthorpe wrote:
> > On Tue, Jan 30, 2018 at 09:40:14PM +0000, Bart Van Assche wrote:
> > > On Tue, 2018-01-30 at 16:33 -0500, Laurence Oberman wrote:
> > > > Can I take your tree and see if this fails for me too,
> > > > Your last tree was fine, so did not have this latest stuff.
> > > > Can I just pull to what I have
> > > 
> > > Hello Laurence,
> > > 
> > > So far I have seen this behavior only inside a VM but not yet on a system
> > > with more memory than the VM. This issue may be specific to the memory size
> > > of the VM. I think we should try to isolate further what caused this before
> > > trying to reproduce it on more setups.
> > 
> > Did you get an oops print related to a kalloc failure?
> > 
> > Or am I wrong and the ENOMEM is coming from someplace else?
> 
> Hello Jason,
> 
> I just noticed the following in the system log:
> 
> Jan 30 12:53:15 ubuntu-vm kernel: ib_srp: rxe0: ib_alloc_mr() failed. Try to reduce max_cmd_per_lun, max_sect or ch_count
> 
> So apparently the ib_alloc_mr() fails sometimes (but not the first few times
> it is called).

Looks like the only way you can get that without hitting a kalloc
oops print is if rxe_alloc() fails, probably here:

	if (atomic_inc_return(&pool->num_elem) > pool->max_elem)
		goto out_put_pool;

That suggests srp hit the max # of MRs in rxe:

	RXE_MAX_MR			= 2 * 1024,

Or maybe we are now leaking MRs someplace?
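
To make the two theories concrete, here is a minimal userspace sketch of
a capped pool counter. It is not the rxe code: struct pool, pool_alloc(),
pool_free() and count_failures() are hypothetical names, C11
atomic_fetch_add() + 1 stands in for the kernel's atomic_inc_return(),
and undoing the increment on failure is an assumption, since the quoted
snippet does not show the error path.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the rxe pool; only the two counters matter. */
struct pool {
	atomic_int num_elem;	/* elements currently allocated */
	int max_elem;		/* cap, e.g. 2 * 1024 for MRs */
};

/* Returns 0 on success, -1 when the pool is already at its cap. */
static int pool_alloc(struct pool *p)
{
	/*
	 * atomic_fetch_add() returns the old value, so adding 1 yields the
	 * new count, mirroring atomic_inc_return() in the kernel.
	 */
	if (atomic_fetch_add(&p->num_elem, 1) + 1 > p->max_elem) {
		/* Assumption: the failed reservation is undone. */
		atomic_fetch_sub(&p->num_elem, 1);
		return -1;
	}
	return 0;
}

/* Release one element back to the pool. */
static void pool_free(struct pool *p)
{
	atomic_fetch_sub(&p->num_elem, 1);
}

/*
 * Run `rounds` allocations against a pool capped at `max_elem` and
 * count how many fail. With `leak` set, nothing is ever freed.
 */
static int count_failures(int max_elem, int rounds, int leak)
{
	struct pool p = { .max_elem = max_elem };
	int failures = 0;

	atomic_init(&p.num_elem, 0);
	for (int i = 0; i < rounds; i++) {
		if (pool_alloc(&p) != 0) {
			failures++;
			continue;
		}
		if (!leak)
			pool_free(&p);
	}
	return failures;
}
```

With balanced alloc/free the counter never approaches the cap, so no
call fails. With a leak the first max_elem calls succeed and every later
one fails, which would match "not the first few times it is called".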

Nothing accepted recently mucks with this, and I am still not seeing
even a tenuous connection to any patches from the last few days.

What was accepted in the past week(s) was a bunch of srp stuff
though:

$ git diff --stat 052eac6eeb5655c52a490a49f09c55500f868558
 MAINTAINERS                                  |   3 +-
 drivers/infiniband/core/Makefile             |   2 +-
 drivers/infiniband/core/cm.c                 |   6 +-
 drivers/infiniband/core/cma.c                |   2 +-
 drivers/infiniband/core/core_priv.h          |  28 ++++
 drivers/infiniband/core/cq.c                 |  16 ++-
 drivers/infiniband/core/device.c             |   4 +
 drivers/infiniband/core/nldev.c              | 374 ++++++++++++++++++++++++++++++++++++++++++++++++++
 drivers/infiniband/core/restrack.c           | 164 ++++++++++++++++++++++
 drivers/infiniband/core/user_mad.c           |   2 +-
 drivers/infiniband/core/uverbs_cmd.c         |   7 +-
 drivers/infiniband/core/uverbs_ioctl.c       |  19 ++-
 drivers/infiniband/core/uverbs_std_types.c   |   3 +
 drivers/infiniband/core/verbs.c              |  17 ++-
 drivers/infiniband/hw/mlx4/cq.c              |   4 +-
 drivers/infiniband/hw/mlx5/cq.c              |   2 +-
 drivers/infiniband/hw/mlx5/mlx5_ib.h         |   4 +-
 drivers/infiniband/hw/mlx5/qp.c              |   5 +-
 drivers/infiniband/hw/mthca/mthca_memfree.c  |   2 +-
 drivers/infiniband/hw/mthca/mthca_user.h     | 112 ---------------
 drivers/infiniband/hw/qedr/verbs.c           |   6 +-
 drivers/infiniband/hw/qib/qib_keys.c         | 235 -------------------------------
 drivers/infiniband/sw/rxe/Kconfig            |   4 +-
 drivers/infiniband/ulp/iser/iser_initiator.c |  16 +--
 drivers/infiniband/ulp/srp/ib_srp.c          | 723 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------------------
 drivers/infiniband/ulp/srp/ib_srp.h          |  43 +++++-
 drivers/infiniband/ulp/srpt/ib_srpt.c        |   2 -
 include/rdma/ib_verbs.h                      |  39 ++++--
 include/rdma/restrack.h                      | 157 +++++++++++++++++++++
 include/scsi/srp.h                           |  17 +++
 include/uapi/rdma/ib_user_verbs.h            |   7 +-
 include/uapi/rdma/rdma_netlink.h             |  49 +++++++
 lib/kobject.c                                |   2 +
 33 files changed, 1511 insertions(+), 565 deletions(-)

Any chance one of the SRP patches got mishandled somehow?

Jason