On Fri, May 03, 2019 at 12:37:13PM +0300, Gal Pressman wrote: > On 02-May-19 16:55, Jason Gunthorpe wrote: > > On Wed, May 01, 2019 at 01:48:20PM +0300, Gal Pressman wrote: > > > >> +static struct efa_comp_ctx *efa_com_submit_admin_cmd(struct efa_com_admin_queue *aq, > >> + struct efa_admin_aq_entry *cmd, > >> + size_t cmd_size_in_bytes, > >> + struct efa_admin_acq_entry *comp, > >> + size_t comp_size_in_bytes) > >> +{ > >> + struct efa_comp_ctx *comp_ctx; > >> + > >> + spin_lock(&aq->sq.lock); > >> + if (!test_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state)) { > >> + ibdev_err(aq->efa_dev, "Admin queue is closed\n"); > >> + spin_unlock(&aq->sq.lock); > >> + return ERR_PTR(-ENODEV); > >> + } > >> + > >> + comp_ctx = __efa_com_submit_admin_cmd(aq, cmd, cmd_size_in_bytes, comp, > >> + comp_size_in_bytes); > >> + spin_unlock(&aq->sq.lock); > >> + if (IS_ERR(comp_ctx)) > >> + clear_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state); > >> + > >> + return comp_ctx; > >> +} > > > > [..] > > > >> +static void efa_com_admin_flush(struct efa_com_dev *edev) > >> +{ > >> + struct efa_com_admin_queue *aq = &edev->aq; > >> + > >> + clear_bit(EFA_AQ_STATE_RUNNING_BIT, &aq->state); > > > > This scheme looks use after free racey to me.. > > The running bit stops new admin commands from being submitted, clearly the exact > moment in which the bit is cleared is "racy" to submission of admin commands but > that is taken care of. > > The submission of an admin command is atomic as it is guarded by an admin queue > lock. > The same lock is acquired by this flow as well when flushing the admin queue. > After all admin commands have been aborted and we know for sure that > no new The problem is the 'abort' does nothing to ensure parallel threads are no longer touching this memory, it just plays with locks in a racy way for the above. Jason