On Wed, Sep 02, 2020 at 09:31:15PM -0300, Jason Gunthorpe wrote: > On Sun, Aug 30, 2020 at 11:40:03AM +0300, Leon Romanovsky wrote: > > From: Leon Romanovsky <leonro@xxxxxxxxxxxx> > > > > The HW release can fail and leave the system in limbo state, > > where SRQ is removed from the table, but can't be destroyed later. > > In every reentry, the initial xa_erase_irq() check will fail. > > > > Rewrite the erase logic to keep index, but don't store the entry > > itself. By doing it, we can safely reinsert entry back in the case > > of destroy failure and be safe from any xa_store_irq() error. > > > > Signed-off-by: Leon Romanovsky <leonro@xxxxxxxxxxxx> > > drivers/infiniband/hw/mlx5/srq_cmd.c | 15 ++++++++++++--- > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/infiniband/hw/mlx5/srq_cmd.c b/drivers/infiniband/hw/mlx5/srq_cmd.c > > index 37aaacebd3f2..c6d807f04d9d 100644 > > +++ b/drivers/infiniband/hw/mlx5/srq_cmd.c > > @@ -596,13 +596,22 @@ void mlx5_cmd_destroy_srq(struct mlx5_ib_dev *dev, struct mlx5_core_srq *srq) > > struct mlx5_core_srq *tmp; > > int err; > > > > - tmp = xa_erase_irq(&table->array, srq->srqn); > > - if (!tmp || tmp != srq) > > + /* Delete entry, but leave index occupied */ > > + tmp = xa_store_irq(&table->array, srq->srqn, NULL, 0); > > + if (WARN_ON(!tmp || tmp != srq)) > > return; > > This isn't an allocating xarray: > > xa_init_flags(&table->array, XA_FLAGS_LOCK_IRQ); > > So storing NULL actually does delete the entry and clean up the memory > and the store below could fail. > > I think this should be written as > > xa_cmpxchg_irq(&table->array, srq->srqn, srq, XA_ZERO_ENTRY, 0); > > And the undo below would be > > xa_cmpxchg_irq(&table->array, srq->srqn, XA_ZERO_ENTRY, srq 0); Thanks, it is better. > > > + xa_erase_irq(&table->array, srq->srqn); > > And this is racy since the FW could have reallocated the same srqn and > already set it in the xarray. Not in mlx5 devices, srqn is allocated in round-robin order. > > It needs to be xa_release_irq(), which looks like it needs to be > added to match xa_reserve_irq() We can't call to xa_lock_irq*() while issuing FW commands. Thanks > > Jason