Re: [PATCH rdma-rc 1/2] RDMA/restrack: Add ability to create non-traceable restrack objects

On Tue, Mar 20, 2018 at 07:30:08PM -0600, Jason Gunthorpe wrote:
> On Wed, Mar 21, 2018 at 12:11:59AM +0000, Parav Pandit wrote:
>
> > > > Right way to do ib core to have open() and close() callback.
> > >
> > > Probably.. But since we already accepted the patches that cause this bug for this
> > > merge window we may be stuck accepting a hack.
> > >
> > > The hack is to let mlx5 opt out of resource tracking for its internal objects.
>
> > There are some patches from Steve that I didn't review deeply, but
> > those patches let some internal data structure of the provider
> > driver to be exposed to user for debugging purpose, which is very
> > useful.
> >
> > The UMR QP is one such core QP that might need debugging in the
> > future, as are some other internal UD QPs that Bodong is working on.
> > Since it might be useful to debug such internal QPs as well, having
> > them resource tracked is more useful than opting out. Not tracking
> > them would block us from debugging them in the future.
>
> Yes, that seems to be likely. Someone will ultimately have to add new
> methods and revise mlx5. It really shouldn't be creating resources
> prior to registering..

There is one main difference between Steve's patches and the UMR flows.
Steve is exposing extra information for standard QPs, which are created
by ib_core.

UMR QPs are something different: they are created by mlx5_ib, and ib_core
isn't aware of them. For example, their type differs from the types
known to rdmatool and ib_core.

Mark and I discussed how to solve the current situation and proposed the
current patch. We all agree that the UMR flow needs to be revisited and
that the ideal solution is to create symmetrical create/destroy UMR flows.
However, we don't have a clear vision of how to do that in -rc6.

This is why it is safe for now to skip UMR completely from restrack.
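
To make the opt-out idea concrete, here is a rough user-space sketch. It
is not the kernel code: struct res_root, struct res_entry, res_add(),
res_del() and res_check_leaks() are names invented for illustration and
do not match the rdma_restrack API. It only shows that an object marked
as untracked never enters the table, so it cannot show up in a later
leak check.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical user-space model of a resource tracker with an opt-out
 * flag. None of these names exist in the kernel; they are assumptions
 * made for illustration only.
 */
#define MAX_RES 8

struct res_entry {
	const char *name;
	bool tracked;	/* false: the driver opted this object out */
	bool live;	/* true between res_add() and res_del() */
};

struct res_root {
	struct res_entry *slots[MAX_RES];
	size_t count;
};

/*
 * Register an object. Untracked objects never enter the table; a full
 * table silently drops the entry in this sketch.
 */
static void res_add(struct res_root *root, struct res_entry *e)
{
	e->live = true;
	if (!e->tracked || root->count == MAX_RES)
		return;
	root->slots[root->count++] = e;
}

/* Unregister an object; untracked objects are simply not found. */
static void res_del(struct res_root *root, struct res_entry *e)
{
	size_t i;

	e->live = false;
	for (i = 0; i < root->count; i++) {
		if (root->slots[i] == e) {
			/* Fill the hole with the last entry. */
			root->slots[i] = root->slots[--root->count];
			return;
		}
	}
}

/* Report every tracked object still registered; return how many. */
static size_t res_check_leaks(const struct res_root *root)
{
	size_t i;

	for (i = 0; i < root->count; i++)
		fprintf(stderr, "leaked resource: %s\n", root->slots[i]->name);
	return root->count;
}
```

In this model a driver-internal object added with tracked == false stays
alive after the leak check without being reported, while a tracked object
that is never passed to res_del() is reported by name.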

>
> > > At least that way we still protect the ULPs on other non-broken drivers.
> > Not sure if any other drivers have dangling resources after ib_unregister_device() is called.
> > > Otherwise Leon's patch would have notrack() API invoked in other provider drivers too.
>
> It is more the ULPs that worry me. ULPs leaving dangling resources
> after client unregister seems kind of likely.
>
> > > Maybe restrack_init should be moved to the register as well in this patch?
>
> > That at least avoids the confusing code.

I admit that the names are confusing, but the place is right:
ib_alloc_device is responsible for initializing various structures, and
this is exactly what restrack_init does.

rdma_restrack_clean() will be renamed to something like
rdma_restrack_check_leaks() in the near future, because I see a need to
rewrite the printed output from that function to make debugging saner.

>
> What if we add this hunk to the patch? Leon?

In -rc6? No way.

I'll add it after I rewrite rdma_restrack_clean(), so people won't
get scary messages without the ability to debug them.

Thanks

>
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 0ab99e62cc5ce0..247f2be0a1516b 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -464,6 +464,12 @@ int ib_register_device(struct ib_device *device,
>  	struct ib_udata uhw = {.outlen = 0, .inlen = 0};
>  	struct device *parent = device->dev.parent;
>
> +	/*
> +	 * Nothing is permitted to create objects prior to calling
> +	 * register_device.
> +	 */
> +	rdma_restrack_clean(&device->res);
> +
>  	WARN_ON_ONCE(device->dma_device);
>  	if (device->dev.dma_ops) {
>  		/*
>
> Jason
