Re: [PATCH rdma-next 01/10] RDMA: Restore ability to fail on PD deallocate

Gal Pressman <galpress@xxxxxxxxxx> · Tue, 25 Aug 2020 17:04:14 +0300

On 25/08/2020 16:44, Jason Gunthorpe wrote:
> On Tue, Aug 25, 2020 at 04:32:57PM +0300, Gal Pressman wrote:
>>> For uverbs it will go into an infinite loop in
>>> uverbs_destroy_ufile_hw() if destroy doesn't eventually succeed.
>>
>> The code breaks the loop in such cases, why infinite loop?
> 
> Oh, that is a bug, it should WARN_ON when that happens, because the
> driver has triggered a permanent memory leak.

Well, a WARN_ON won't do much good if you're stuck in an infinite loop :). The
break is definitely needed there.
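
For illustration, the "warn and break" behavior can be modeled in a few lines of
plain userspace C. This is only a rough sketch, not the uverbs code: struct obj,
try_destroy(), cleanup_pass() and cleanup_all() are made-up names, and the
"broken" flag stands in for a driver whose destroy never succeeds. Passes are
repeated because an object (e.g. a PD) cannot go away while something still
uses it. A pass that frees nothing means no later pass can make progress
either, so the loop warns and bails out instead of spinning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a tracked uobject; not a kernel structure. */
struct obj {
    bool live;    /* still allocated */
    int parent;   /* index of the object this one uses, -1 if none */
    bool broken;  /* models a driver whose destroy always fails */
};

/* Destroy objs[i]; fails while another live object still uses it. */
static int try_destroy(struct obj *objs, size_t n, size_t i)
{
    if (objs[i].broken)
        return -1;
    for (size_t j = 0; j < n; j++)
        if (objs[j].live && objs[j].parent == (int)i)
            return -1;
    objs[i].live = false;
    return 0;
}

static size_t count_live(const struct obj *objs, size_t n)
{
    size_t live = 0;

    for (size_t i = 0; i < n; i++)
        if (objs[i].live)
            live++;
    return live;
}

/* One sweep over all objects; returns how many were destroyed. */
static size_t cleanup_pass(struct obj *objs, size_t n)
{
    size_t destroyed = 0;

    for (size_t i = 0; i < n; i++)
        if (objs[i].live && try_destroy(objs, n, i) == 0)
            destroyed++;
    return destroyed;
}

/* Sweep until everything is gone; returns the number of leaked objects. */
static size_t cleanup_all(struct obj *objs, size_t n)
{
    assert(objs != NULL || n == 0);

    while (count_live(objs, n) > 0) {
        if (cleanup_pass(objs, n) == 0) {
            /* No progress in a full pass: warn and stop retrying. */
            fprintf(stderr, "warning: %zu object(s) leaked\n",
                    count_live(objs, n));
            break;
        }
    }
    return count_live(objs, n);
}
```

With a PD and a QP that uses it, two passes free everything. If the QP's
destroy never succeeds, the first pass frees nothing and both objects are
reported as leaked, but the loop still terminates.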

>>> For kernel it will trigger WARN_ON's and then a permanent memory leak.
>>>
>>>> I agree that drivers shouldn't fail destroy commands, but you know.. bugs/errors
>>>> happen (especially when dealing with hardware), and we have a way to propagate
>>>> them, why do it for only some of the drivers?
>>>
>>> There is no way to propagate them.
>>>
>>> All destroy must eventually succeed.
>>
>> There is no way to propagate them on process cleanup, but the destroy verbs have
>> a return code all the way back to libibverbs, which we can use for error
>> propagation.
> 
> It is sort of OK for a driver to fail during RDMA_REMOVE_DESTROY.
> 
> All other reason codes must eventually succeed.
> 
>> The cleanup flow can either ignore the return value, or we can add
>> another parameter that explicitly means the call shouldn't fail and all
>> allocated memory/state should be freed.
> 
> I don't really see the value to return the error code to userspace, it
> would require churning all the drivers and all the destroy functions
> to pass the existing reason in.
> 
> Since all the details of the FW failure reason are lost to some EINVAL
> (or already logged to dmesg) I don't see much point.

Right, as always, the error code would probably not contain much information,
but there's a big difference between returning error code X vs Y and returning
success when the call actually failed. To me that just feels wrong, at least in
cases where we can prevent that.

Won't argue about the churn, it's a lot of work for a "small" benefit in the
rare error cases. Can't we leave it up to the individual driver to decide
whether it supports that or not? i.e. default behavior would be that
failures are not allowed, and a driver can opt to respect the extra parameter
(kinda like how DEVX "supports" failing)?
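
A rough userspace sketch of what such an opt-in could look like. Everything
here is hypothetical: the can_fail_destroy flag, the may_fail parameter, the
two reason values and core_destroy() are invented for illustration and are not
existing RDMA core interfaces. The assumption is that the core only lets a
failure through for an explicit user destroy on a driver that opted in. In
every other case the driver is told it must free, and a failure is warned
about and then treated as freed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical reasons, loosely modeled on the remove reasons above. */
enum destroy_reason { REASON_USER_DESTROY, REASON_PROCESS_CLEANUP };

/* Hypothetical per-driver ops; can_fail_destroy is the opt-in. */
struct driver_ops {
    bool can_fail_destroy;
    int (*destroy)(void *obj, bool may_fail);
};

/* Core-side wrapper: decides whether a failure may reach the caller. */
static int core_destroy(const struct driver_ops *ops, void *obj,
                        enum destroy_reason reason)
{
    bool may_fail;
    int ret;

    assert(ops != NULL && ops->destroy != NULL);

    may_fail = ops->can_fail_destroy && reason == REASON_USER_DESTROY;
    ret = ops->destroy(obj, may_fail);
    if (ret && !may_fail) {
        /* Driver bug: this destroy was not allowed to fail. */
        fprintf(stderr, "warning: destroy failed (%d), freeing anyway\n", ret);
        return 0;
    }
    return ret;
}

/* Made-up object with an injectable "hardware" error. */
struct fake_obj {
    bool hw_error;
    bool freed;
};

/* Opt-in driver: reports the error only when the core allows it. */
static int optin_destroy(void *p, bool may_fail)
{
    struct fake_obj *o = p;

    if (o->hw_error && may_fail)
        return -EIO;   /* object is kept, error goes back to the caller */
    o->freed = true;   /* must-succeed path: free regardless */
    return 0;
}
```

With this shape, drivers that never set the flag need no changes to their
destroy logic, and process cleanup keeps its "must succeed" semantics for
everyone.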

>>>>> If the chip fails a destroy when it should not then it has failed and
>>>>> should be disabled at PCI and reset, continuing to free anyhow.
>>>>
>>>> How do we reset the device when there are active apps using it?
>>>
>>> The zap stuff revokes the BAR mmapping, it triggers device fatal to
>>> userspace and that is mostly it for userspace..
>>
>> Interesting, is there a reference driver that does that today?
> 
> I think both mlx drivers and hns do? See
> uverbs_user_mmap_disassociate()

Thanks!