Re: [PATCH rdma-next 01/10] RDMA: Restore ability to fail on PD deallocate

On Thu, Aug 27, 2020 at 11:29:44PM +0000, Saleem, Shiraz wrote:
> > Subject: Re: [PATCH rdma-next 01/10] RDMA: Restore ability to fail on PD
> > deallocate
> > 
> > On Thu, Aug 27, 2020 at 02:06:03AM +0000, Saleem, Shiraz wrote:
> > 
> > > Which then boils down do we just keep a simpler definition of the API
> > > contract -- driver can just return whatever the true error code is?
> > 
> > No, that was always wrong. In almost every case returning codes from destroy is a
> > driver bug, flat out. It causes kernel leaking memory/worse and unrecoverable
> > userspace failures.
> > 
> seems like we are opening a can then.

It is not something new, it has always been like this, with these
rules. The effort to remove the return codes simply failed :(

> I can see a new provider seeing the int return type and returning error codes.
> And maybe being stumped by seeing some providers ignoring device errors and faking a success.
> And one provider returning error codes.

No, things can't ignore device failures. If the provider can't
shut down a rogue device then it must return an error, leak memory and
accept the WARN_ON. Otherwise the device will cause memory corruption
by DMA'ing to memory that has been freed.
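
To make the rule concrete, here is a minimal user-space sketch of the
pattern, not code from any real driver. Every name in it (toy_dev,
toy_pd, toy_hw_destroy_pd, toy_alloc_pd, toy_dealloc_pd) is hypothetical,
and the fprintf() only stands in for the kernel's WARN_ON. The point is
the ordering: memory is freed only after the device has confirmed the
teardown, and on failure it is deliberately leaked and the error is
returned.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a device; not a real RDMA core structure. */
struct toy_dev {
	bool hw_responds;	/* false models a device that will not confirm teardown */
	int leaked_objects;	/* objects intentionally left allocated */
};

/* Hypothetical protection-domain object owning DMA-visible memory. */
struct toy_pd {
	struct toy_dev *dev;
	void *dma_buf;		/* memory the device may still DMA into */
};

/* Hypothetical HW command: 0 on success, -EIO if the device did not confirm. */
static int toy_hw_destroy_pd(struct toy_dev *dev)
{
	return dev->hw_responds ? 0 : -EIO;
}

/* Allocate the object and its buffer; returns NULL on allocation failure. */
static struct toy_pd *toy_alloc_pd(struct toy_dev *dev)
{
	struct toy_pd *pd = malloc(sizeof(*pd));

	if (!pd)
		return NULL;
	pd->dma_buf = malloc(4096);
	if (!pd->dma_buf) {
		free(pd);
		return NULL;
	}
	pd->dev = dev;
	return pd;
}

/* Destroy path: free memory only once the device has let go of it. */
static int toy_dealloc_pd(struct toy_pd *pd)
{
	int ret;

	assert(pd != NULL);

	ret = toy_hw_destroy_pd(pd->dev);
	if (ret) {
		/*
		 * The device may still write to dma_buf, so freeing it now
		 * would risk corrupting whoever gets that memory next.
		 * Leak it on purpose and report the failure.
		 */
		pd->dev->leaked_objects++;
		fprintf(stderr, "WARN: PD teardown failed (%d), leaking memory\n", ret);
		return ret;
	}

	free(pd->dma_buf);
	free(pd);
	return 0;
}
```

With hw_responds set to false, toy_dealloc_pd() returns -EIO and leaves
both allocations in place; with it set to true, everything is freed and
0 is returned. Faking success in the failure case would mean calling
free() on memory the device can still write to.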

Having an RDMA driver that can recover from HW errors via device
reset is really required to close these edge cases.

I suspect no RDMA driver gets this all right today.

Jason


