Re: [PATCH v4 03/13] SIW network and RDMA core interface

"Bernard Metzler" <BMT@xxxxxxxxxxxxxx> · Tue, 12 Feb 2019 15:58:54 +0000

-----"Jason Gunthorpe" <jgg@xxxxxxxx> wrote: -----

>To: "Bernard Metzler" <BMT@xxxxxxxxxxxxxx>
>From: "Jason Gunthorpe" <jgg@xxxxxxxx>
>Date: 02/11/2019 07:03PM
>Cc: linux-rdma@xxxxxxxxxxxxxxx
>Subject: Re: [PATCH v4 03/13] SIW network and RDMA core interface
>
>On Mon, Feb 11, 2019 at 04:20:53PM +0000, Bernard Metzler wrote:
>> 
>> >To: "Bernard Metzler" <BMT@xxxxxxxxxxxxxx>
>> >From: "Jason Gunthorpe" <jgg@xxxxxxxx>
>> >Date: 01/31/2019 06:22PM
>> >Cc: linux-rdma@xxxxxxxxxxxxxxx
>> >Subject: Re: [PATCH v4 03/13] SIW network and RDMA core interface
>> >
>> >On Thu, Jan 31, 2019 at 03:24:01PM +0000, Bernard Metzler wrote:
>> >
>> >> >> +static void siw_netdev_unregistered(struct work_struct
>*work)
>> >> >> +{
>> >> >> +	struct siw_device *sdev = container_of(work, struct
>> >siw_device,
>> >> >> +					       netdev_unregister);
>> >> >> +
>> >> >> +	struct siw_qp_attrs qp_attrs;
>> >> >> +	struct list_head *pos, *tmp;
>> >> >> +
>> >> >> +	memset(&qp_attrs, 0, sizeof(qp_attrs));
>> >> >> +	qp_attrs.state = SIW_QP_STATE_ERROR;
>> >> >> +
>> >> >> +	/*
>> >> >> +	 * Mark all current QP's of this device dead
>> >> >> +	 */
>> >> >> +	list_for_each_safe(pos, tmp, &sdev->qp_list) {
>> >> >> +		struct siw_qp *qp = list_entry(pos, struct siw_qp, devq);
>> >> >> +
>> >> >> +		down_write(&qp->state_lock);
>> >> >> +		(void) siw_qp_modify(qp, &qp_attrs, SIW_QP_ATTR_STATE);
>> >> >> +		up_write(&qp->state_lock);
>> >> >> +	}
>> >> >
>> >> >Don't put code around ib_unregister. If you need to do things
>use
>> >a
>> >> >callback at the right moment in the unregister flow. You might
>> >need
>> >> >to
>> >> >add a callback (see my latest series)
>> >> 
>> >> I thought by having that executed in a work queue context only
>> >> we'd be safe. I have to bring down all related QP's if a network
>> >device
>> >> suddenly disappears... Looks like I have to better understand
>the
>> >> intended unregister flow. 
>> >
>> >If it is important it should be done during module unload and link
>> >removal, if it is not important, it shouldn't be done :)
>> 
>> If siw catches a NETDEV_UNREGISTER event, it shall first
>> close all related connection endpoints before the ib_device
>> can go away.
>
>For things to work properly you can't do any steps outside the
>unregister and its callbacks.
>
>NETDEV_UNREGISTER events should only call
>ib_unregsiter_device_queued() and nothing more.
>
>> Following your new device management design, I would now
>> first schedule local work to cleanup those siw private
>> resources, and later ib_unregister_device_queued(),
>
>You should not schedule work.
>
>Drivers should be designed so that when all PD's are closed most of
>their resources are gone as well. The core code will ensure all PD's
>are closed.
>
The problem probably is, that PD's are not going away as
long as the application sits on them. What I want is the provider
to break affected connections, if a link goes down and let
the application know about it.
To exemplify this - if I have a netdev which wants to unregister
and a connection using it - it cannot unregister since the rdma core
does not free up affected resources as long as referenced (by the
application). e.g., I see things like
    unregister_netdevice: waiting for enp1s0f4 to become free. Usage count = 1

What I just tried is to use the preceding NETDEV_GOING_DOWN event to mark
my affected QP's dead. That seem to work well. 

>If there are internal driver PD/etcs then we should add a callback to
>clean them up right after the clients are cleaned up.
>
>> or is it OK to do that local cleanup stuff within the 
>> dealloc_driver() method?
>
>dealloc_driver should free memory, and do any final fences against
>async work. It shouldn't spawn new work. 
>
>It is an appropriate place to free socket resources and other related
>things.
>
>Jason
>
>
Thanks
Bernard.