Re: [PATCH for-next] Revert "IB/mlx5: Don't return errors from poll_cq"

On Fri, Mar 04, 2022 at 10:53:34AM +0000, Haakon Bugge wrote:
> 
> 
> > On 3 Mar 2022, at 20:09, Leon Romanovsky <leon@xxxxxxxxxx> wrote:
> > 
> > On Thu, Mar 03, 2022 at 02:50:17PM +0100, Håkon Bugge wrote:
> >> This reverts commit dbdf7d4e7f911f79ceb08365a756bbf6eecac81c.
> >> 
> >> Commit dbdf7d4e7f91 ("IB/mlx5: Don't return errors from poll_cq") is
> >> needed, when driver/fw communication gets wedged.
> >> 
> >> With a large fleet of systems equipped with CX-5, we have observed the
> >> following mlx5 error message:
> >> 
> >> wait_func:945:(pid xxx): ACCESS_REG(0x805) timeout. Will cause a
> >> leak of a command resource
> > 
> > It is arguably FW issue. Please contact your Nvidia support representative.
> 
> The RC for the whacked driver/fw communication has been raised with Nvidia support. This commit is meant to keep the kernel from crashing when this situation arises. And inevitably, it may happen.

I'm confident that the support team will find the best possible solution to
the raised issue.

Thanks

> 
> 
> Thxs, Håkon


