RE: [PATCH rdma-next 01/10] RDMA: Restore ability to fail on PD deallocate

"Saleem, Shiraz" <shiraz.saleem@xxxxxxxxx> · Thu, 27 Aug 2020 23:30:00 +0000



> Subject: Re: [PATCH rdma-next 01/10] RDMA: Restore ability to fail on PD
> deallocate
> 
> On Thu, Aug 27, 2020 at 02:06:03AM +0000, Saleem, Shiraz wrote:
> > > Subject: Re: [PATCH rdma-next 01/10] RDMA: Restore ability to fail
> > > on PD deallocate
> > >
> > > On Wed, Aug 26, 2020 at 12:49:03AM +0000, Saleem, Shiraz wrote:
> > >
> > > > The API is quite confusing now. If drivers are not expected to
> > > > fail the destroy and there is no way to propagate the device
> > > > failures, then the return type should be a void.
> > >
> > > More or less, drivers can only return -EAGAIN with the idea that a
> > > future call during the close process will eventually succeed.
> > >
> > > Any permanent failure will trigger WARN_ON and a memory leak
> > >
> > > Maybe we should switch the return code to bool or something to be a
> > > little clearer that it is request to retry, not a failure?
> > >
> > There is no retry for kernel object destroy right? So not sure bool is making It
> clearer.
> 
> Right, kernel verbs users don't know how to deal with destroy failure and designed
> to ensure that destroy always success.
> 
> >
> > I am not very familiar with devx flows but doesn’t it bypass the ib verbs layer
> altogether?
> > i.e. mlx5_ib_dealloc_pd isn’t directly called in the devx flows no? So
> > why changes its return type and other provider destroy callbacks?
> 
> DevX itself indeed bypasses ib_core and as a standalone feature doesn't need
> any changes in destroys. The problem arises when ib_core object is created with
> ibv_create_XXX() and forwarded later to DevX context.
> 
> FW counts DevX accesses and elevates internal reference counters to ensure that
> user will get proper error if he tries to destroy in-use resources.
> 
> This error is returned to mlx5_ib_dealloc_pd() too if DevX is not cleaned. This call
> can be executed by user anytime, for example if he decided to skip DevX cleanup
> and the ib_core/mlx5_ib can't prevent call to mlx5_ib_dealloc_pd() at this stage.
> 
> The difference between mlx5 device from other providers that HW/FW guarantees
> full cleanup during file close.
> 
> >
> > But lets go down the path that we really need a return code in the destroy APIs
> to solve this problem.
> > For one I don’t see how we can say its meant exclusively for devx drivers to use
> for a fail.
> > Also can we really claim the API contract is that driver can fail a destroy given a
> future destroy will succeed?
> > Since the kernel destroy has no retry.
> 
> This is why we have special calls for kernel users with WARN_ON() and forced
> cleanup of ib_core resources.
> 
> >
> > Which then boils down do we just keep a simpler definition of the API contract --
> driver can just return whatever the true error code is?
> > i.e. if it wants a retry, use EAGAIN. If it has a non recoverable device error, then
> reset the device, clean up the resources but return ENOTRECOVERABLE.
> > ib_core can enable the retry logic for EAGAIN _only_.  For other error codes,
> Ib_core can trigger a warn_on or something to indicate permanent failure.
> 
> We can, but drivers should implement this EAGAIN/ENOTRECOVERABLE logic,
> this is why in initial phase we are returning always success.
> 
I don’t think Jason is entertaining this. So...