Re: [PATCH rdma-next] IB/sa: Give caller control of aliasguid sa query buffer lifetime

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Thu, Mar 07, 2019 at 05:40:50AM -0800, Ira Weiny wrote:
> On Thu, Mar 07, 2019 at 07:18:11PM +0000, Jason Gunthorpe wrote:
> > On Thu, Mar 07, 2019 at 01:53:44AM -0800, Ira Weiny wrote:
> > > > @@ -440,11 +441,17 @@ static void aliasguid_query_handler(int status,
> > > >  				   alias_guid_work,
> > > >  				   msecs_to_jiffies(resched_delay_sec * 1000));
> > > >  	}
> > > > -	if (cb_ctx->sa_query) {
> > > > -		list_del(&cb_ctx->list);
> > > > +	if (!list_empty(&cb_ctx->list)) {
> > > > +		/* we own the callback context */
> > > > +		list_del_init(&cb_ctx->list);
> > > >  		kfree(cb_ctx);
> > > 
> > > Doesn't aliasguid_query_handler() get called from send_handler()?
> > > Which means that this kfree() will delete the query before
> > > send_handler is done with it?
> > 
> > Yep, send_handler can't touch query after calling callback in this
> > mode.
> > 
> > Probably the best solution is to have a ib_sa_destory_query() that
> > does the free_mad/etc, and do something like this:
> > 
> > diff --git a/drivers/infiniband/core/sa_query.c b/drivers/infiniband/core/sa_query.c
> > index 6c221ec919640d..b563e1ffd06772 100644
> > +++ b/drivers/infiniband/core/sa_query.c
> > @@ -2150,6 +2150,7 @@ static void send_handler(struct ib_mad_agent *agent,
> >  {
> >         struct ib_sa_query *query = mad_send_wc->send_buf->context[0];
> >         unsigned long flags;
> > +       bool callback_owns_memory = !query->release;
> >  
> >         if (query->callback)
> >                 switch (mad_send_wc->status) {
> > @@ -2167,6 +2168,10 @@ static void send_handler(struct ib_mad_agent *agent,
> >                         break;
> >                 }
> >  
> > +       /* The callback must call ib_sa_destroy() */
> > +       if (callback_owns_memory)
> > +               return;
> > +
> >         spin_lock_irqsave(&idr_lock, flags);
> >         idr_remove(&query_idr, query->id);
> 
> What about this idr_remove and putting the client reference?  I think it is a
> mistake to rely on the callers for this.

The caller should do 'destroy' when it is done with the SA struct
which does all this stuff.

> >         spin_unlock_irqrestore(&idr_lock, flags);
> > @@ -2174,8 +2179,7 @@ static void send_handler(struct ib_mad_agent *agent,
> >         free_mad(query);
> >         if (query->client)
> >                 ib_sa_client_put(query->client);
> > -       if (query->release)
> > -               query->release(query);
> > +       query->release(query);
> >  }
> >  
> >  static void recv_handler(struct ib_mad_agent *mad_agent,
> > 
> > > This whole patch seems like an anti-patern from the rest of the SA query
> > > interface.  What if ib_sa_guidinfo_rec_callback() took a reference to the
> > > sa_query before it returns it and allow the free to happen when that reference
> > > is put from the driver?
> > 
> > The entire sa query interface is a giant anti-pattern with bugs. This
> > approach the correct way to do things like this.
> 
> I agree it has not kept up with the times.  However, the sa_query object was an
> internal bookkeeping object for the saquery interface.  The patch for the guid
> info query apparently changed that.

The API never worked, it creates races and bugs to work like this.

> The saquery stuff was, in my interpretation, a "fire and wait for your
> callback" interface.  

Which is the problem, this makes cancel not work right, and cancel
must work for any real kernel user.

> When the callback came back then you took your data or the error at
> that time.  I've not kept up with this GUID info query but why does
> it need to have internal knowledge of the sa_query bookkeeping
> anyway?

It needs cancel to work more than the other users do.

All users should use this alternative API and that ugly idr thing
removed, but Jack is just doing guid alias, at least to start.

Jason




[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux