Re: [PATCH v6 06/13] SIW application interface

Jason Gunthorpe <jgg@xxxxxxxx> · Fri, 29 Mar 2019 13:16:17 -0300



On Fri, Mar 29, 2019 at 03:48:25PM +0000, Bernard Metzler wrote:
> 
> >To: "Bernard Metzler" <BMT@xxxxxxxxxxxxxx>
> >From: "Jason Gunthorpe" <jgg@xxxxxxxx>
> >Date: 03/29/2019 03:04PM
> >Cc: "Leon Romanovsky" <leon@xxxxxxxxxx>, linux-rdma@xxxxxxxxxxxxxxx
> >Subject: Re: [PATCH v6 06/13] SIW application interface
> >
> >On Fri, Mar 29, 2019 at 12:58:08PM +0000, Bernard Metzler wrote:
> >
> >> >> +int siw_query_device(struct ib_device *base_dev, struct
> >> >ib_device_attr *attr,
> >> >> +		     struct ib_udata *unused)
> >> >> +{
> >> >> +	struct siw_device *sdev = to_siw_dev(base_dev);
> >> >> +	/*
> >> >> +	 * A process context is needed to report avail memory
> >resources.
> >> >> +	 */
> >> >> +	if (in_interrupt())
> >> >> +		return -EINVAL;
> >> >
> >> >?????
> >> >
> >> The driver reports the maximum memory size available for reg_mr
> >> reservation via current rlimit(RLIMIT_MEMLOCK). This is valid only
> >> within a process context. So far I am not sure a kernel client
> >wouldn't
> >> try to call us out of an interrupt context.
> >
> >This is completely bogus
> >
> >The test for user/kernel in verbs command is to check 'udata != NULL'
> >(ie that unused parameter there)
> >
> >> Alternatively, siw could just report ~0ull for max_mr_size. But I 
> >> often did hit the case where user land applications tried to
> >register
> >> more memory than allowed per current user credentials, and the
> >devices
> >> 'attr.max_mr_size' was a good indication ulimits should get
> >adapted.
> >
> >All drivers have this problem, you should report device capabilites
> >here, not what the user is permitted to do. Apps can query ulimit
> >separately if they desire.
> 
> Makes sense!
> 
> >
> >> >And please without siw_OBJ_get and siw_OBJ_put, the are not
> >needed.
> >> >If code was called in destroy paths, it means that object is
> >exists
> >> >and ready to be released.
> >> >
> >> 
> >> I understand the rdma mid layer code undergoes a substantial
> >> revision to take responsibility of rdma resources lifetime.
> >> Let me try to get rid of all reference counting where I am sure
> >> I can rely on rdma midlayer. There are cases though, where
> >> I do not see how that should work out: assume a registered
> >> memory region which gets invalidated while referenced by
> >> a current inbound WRITE or outbound READ. The driver must
> >> guarantee the referenced memory region object stays intact,
> >> and that should not require to take a heavy lock on that
> >> object on the fast path.
> >
> >You need some kind of managed concurrancy - you can't just randomly
> >touch a MR without locking. It could be destroyed, or re-reged, or
> >whatever.
> >
> Yes.
> 
> >> Same problem seem to be true for the life time of
> >> a QP. The mid layer cannot easily make sure the QP object
> >> pointer remains valid for all CQE's referencing that
> >> QP. That pointer shall remain valid until the CQE is
> >> polled by the user, or the CQ gets flushed.
> >
> >Nope, we can destroy a QP before a CQ and you have to make this work
> >out. You probably have to lock the CQ and purge the pointers during
> >QP
> >destroy if you are doing a scheme like this.
> >
> 
> A purged QP pointer of NULL in a work completion isn't valid
> or defined, and we do not want all kernel applications to
> test that pointer for NULL anyway.

Well, null will crash if the ULP isn't coded right

> So let me just provide the QP pointer to the application,
> which may reference free'd memory. That's all I can do.

I'm pretty sure the semantic we have in the QP/CQ layer is that the
ULP destroying the QP must move it error and flush out the CQ then
destroy the QP

ie see ib_drain_qp

Jason