RE: [PATCH rdma-next 11/13] RDMA/efa: Add EFA verbs implementation

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




> -----Original Message-----
> From: Doug Ledford <dledford@xxxxxxxxxx>
> Sent: Thursday, December 6, 2018 1:54 PM
> To: Jason Gunthorpe <jgg@xxxxxxxx>
> Cc: Gal Pressman <galpress@xxxxxxxxxx>; Parav Pandit
> <parav@xxxxxxxxxxxx>; Alexander Matushevsky <matua@xxxxxxxxxx>;
> Yossi Leybovich <sleybo@xxxxxxxxxx>; linux-rdma@xxxxxxxxxxxxxxx; Tom
> Tucker <tom@xxxxxxxxxxxxxxxxxxxxx>
> Subject: Re: [PATCH rdma-next 11/13] RDMA/efa: Add EFA verbs
> implementation
> 
> On Thu, 2018-12-06 at 12:11 -0700, Jason Gunthorpe wrote:
> > On Thu, Dec 06, 2018 at 02:01:54PM -0500, Doug Ledford wrote:
> > > On Thu, 2018-12-06 at 10:49 -0700, Jason Gunthorpe wrote:
> > > > On Thu, Dec 06, 2018 at 10:47:33AM +0200, Gal Pressman wrote:
> > > >
> > > > > > > +static void efa_destroy_ah_id(struct efa_dev *dev, u8 *id) {
> > > > > > > +	u16 device_ah;
> > > > > > > +	u32 ref_count;
> > > > > > > +	int err;
> > > > > > > +
> > > > > > > +	mutex_lock(&dev->ah_list_lock);
> > > > > >
> > > > > > Create and destroy ah for kernel consumers cannot sleep.
> > > > > > Though I understand that currently you don't have kernel
> consumers.
> > > > > > This should be changed to spin lock along with other allocation to
> GFP_ATOMIC.
> > > > >
> > > > > Create/destroy AH must be communicated with our device (in the
> > > > > form of an admin command), this requires us to sleep regardless
> > > > > of the mutex/memory allocation (we shouldn't busy wait while
> waiting for the device to complete the command).
> > > > > None of the currently supported flows are done in an atomic
> > > > > context, we will have to figure that out when we add support for
> those.
> > > >
> > > > You don't get to make up your own rules. The function pointer this
> > > > is being assigned to required atomic support.
> > >
> > > The core code itself treats this as non-atomic for user space calls,
> > > and only atomic for kernel calls.  So this isn't making up their own
> > > rules, this is following *exactly* the same rules as the current
> > > core kernel code, they simply don't provide a kernel verbs API
> > > implementation, only a user space implementation.
> >
> > And what if tomorrow we decide, for some reason, to chuck a spinlock
> > or RCU critical region around the create_ah in uverbs? The function
> > pointer is defined to be atomic, and is atomic in *every normal
> > driver* so this should absolutely be OK.
> 
> Then we end up having to duplicate the three or so bugfix patches where we
> moved roce sgid resolution out of the critical path because it can sleep.  

I most likely didn't understand your and Jason comment here.
But want to make things clear, if I misunderstood.
create_ah for kernel consumer for RoCE is non-blocking.
Resolving dmac for user ah is taken out in rdma_create_user_ah() in blocking context.
API contact is simple, when creating ah using rdma_create_ah, caller must initialized all ah_attr fields. No partial initialization.

So that _rdma_create_ah() and providers through both path have uniform way to get the GID attributes from kernel and user.

There were cases in past where in CM path for RoCE route/destination mac resolution were done multiple times, all to my knowledge is made proper know, as a result kernel consumers always pass AH with ah_attr initialized properly.

> We
> end up making the code problems we have worse.  Better to redesign this to
> remove that atomic assumption.  Hence why I had them drop the stuff they
> had.  If we get to the point of having two link layers that can't comply with
> the atomic requirement, then I think it's time to look at changing the
> requirement.
> 
> > > Besides, they had something ugly in there and I told them to drop
> > > it, so you can blame me for that.
> >
> > Most likely a big wack of this driver should be using driver specific
> > uapi calls, not abusing the common API with proprietary stuff. Then
> > they don't have these API mismatch problems.
> >
> > But, that should come out of a careful think on how/if we want to
> > evolve the RDMA to support MPI accelerators with no kernel API.
> >
> > Jason
> 
> --
> Doug Ledford <dledford@xxxxxxxxxx>
>     GPG KeyID: B826A3330E572FDD
>     Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux