Re: [PATCH rdma-rc] RDMA/mlx5: Clear old rate limit when closing QP

On Thu, Oct 17, 2019 at 04:12:04PM -0400, Doug Ledford wrote:
> On Wed, 2019-10-02 at 15:02 +0300, Leon Romanovsky wrote:
> > From: Rafi Wiener <rafiw@xxxxxxxxxxxx>
> >
> > Before QP is closed it changes to ERROR state, when this happens
> > the QP was left with old rate limit that was already removed from
> > the table.
> >
> > Fixes: 7d29f349a4b9 ("IB/mlx5: Properly adjust rate limit on QP state
> > transitions")
> > Signed-off-by: Rafi Wiener <rafiw@xxxxxxxxxxxx>
> > Signed-off-by: Oleg Kuporosov <olegk@xxxxxxxxxxxx>
> > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx>
>
> If you are in the process of closing the queue pair, does this solve
> some sort of multi-close race, or is it just being pedantic before
> freeing the qp struct?

It fixes a real bug that ends in a panic. I didn't include the splat
because it contained the debug info we needed to find this problem.

In a nutshell, the bug comes from how we store rate limits: once in a
table in the global mlx5_core_dev, and once as a struct (not a pointer)
of mlx5_rate_limit inside mlx5_ib_qp. This combination still allows
access to the (old) rate limit through the ibqp after the table entry is
gone, for example for a comparison (mlx5_rl_are_equal).
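
To make the storage problem concrete, here is a simplified user-space
model of it. This is only a sketch and not the driver code: every name
below (struct device, struct qp, rl_add, rl_remove, qp_to_error, and so
on) is hypothetical, and the table is reduced to a small refcounted
array. It only shows how a by-value copy in the QP can outlive the table
entry it was copied from, and how clearing the copy avoids that.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a rate limit description. */
struct rate_limit {
    uint32_t rate;            /* 0 means "no rate limit" */
    uint32_t max_burst_sz;
    uint16_t typical_pkt_sz;
};

/* One slot of the (hypothetical) per-device rate table. */
struct rl_entry {
    struct rate_limit rl;
    unsigned int refcount;    /* 0 means the slot is free */
};

#define RL_TABLE_SIZE 4

struct device {
    struct rl_entry table[RL_TABLE_SIZE];
};

struct qp {
    struct rate_limit rl;     /* a copy by value, not a pointer into the table */
};

static bool rl_equal(const struct rate_limit *a, const struct rate_limit *b)
{
    return a->rate == b->rate &&
           a->max_burst_sz == b->max_burst_sz &&
           a->typical_pkt_sz == b->typical_pkt_sz;
}

/* Return the live table entry matching rl, or NULL if there is none. */
static struct rl_entry *rl_find(struct device *dev, const struct rate_limit *rl)
{
    for (int i = 0; i < RL_TABLE_SIZE; i++)
        if (dev->table[i].refcount && rl_equal(&dev->table[i].rl, rl))
            return &dev->table[i];
    return NULL;
}

/* Take a reference on rl, creating the entry if needed. -1 if the table is full. */
int rl_add(struct device *dev, const struct rate_limit *rl)
{
    struct rl_entry *e = rl_find(dev, rl);

    if (e) {
        e->refcount++;
        return 0;
    }
    for (int i = 0; i < RL_TABLE_SIZE; i++) {
        if (!dev->table[i].refcount) {
            dev->table[i].rl = *rl;
            dev->table[i].refcount = 1;
            return 0;
        }
    }
    return -1;
}

/* Drop a reference on rl. -1 if no live entry matches. */
int rl_remove(struct device *dev, const struct rate_limit *rl)
{
    struct rl_entry *e = rl_find(dev, rl);

    if (!e)
        return -1;
    assert(e->refcount > 0);
    e->refcount--;
    return 0;
}

/* Attach a rate limit to the QP and remember it by value. */
int qp_set_rate(struct device *dev, struct qp *qp, const struct rate_limit *rl)
{
    if (rl_add(dev, rl))
        return -1;
    qp->rl = *rl;
    return 0;
}

/*
 * Move the QP to the error state: the table reference is released.
 * With clear_copy == false the QP keeps its old copy (the buggy case);
 * with clear_copy == true the copy is reset as well.
 */
void qp_to_error(struct device *dev, struct qp *qp, bool clear_copy)
{
    if (qp->rl.rate)
        rl_remove(dev, &qp->rl);
    if (clear_copy)
        memset(&qp->rl, 0, sizeof(qp->rl));
}

/* True if the QP still carries a rate limit that has no live table entry. */
bool qp_rl_is_stale(struct device *dev, const struct qp *qp)
{
    return qp->rl.rate != 0 && rl_find(dev, &qp->rl) == NULL;
}
```

In this model, calling qp_to_error() with clear_copy == false leaves a
QP for which qp_rl_is_stale() is true, so any later compare against
qp->rl works on a limit the table no longer holds. With clear_copy ==
true the copy is zeroed and nothing stale is left behind.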

The best solution is to rewrite the rl logic to use pointers, but that
was too much to demand from Oleg and Rafi, who hit this bug with their
user space application.

Thanks

>
> I took it regardless, just curious.
>
> --
> Doug Ledford <dledford@xxxxxxxxxx>
>     GPG KeyID: B826A3330E572FDD
>     Fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD




