On Wed, Feb 27, 2019 at 01:17:51PM +0000, Bernard Metzler wrote: > >> + sdev->attrs.max_qp = SIW_MAX_QP; > >> + sdev->attrs.max_qp_wr = SIW_MAX_QP_WR; > >> + sdev->attrs.max_ord = SIW_MAX_ORD_QP; > >> + sdev->attrs.max_ird = SIW_MAX_IRD_QP; > >> + sdev->attrs.max_sge = SIW_MAX_SGE; > >> + sdev->attrs.max_sge_rd = SIW_MAX_SGE_RD; > >> + sdev->attrs.max_cq = SIW_MAX_CQ; > >> + sdev->attrs.max_cqe = SIW_MAX_CQE; > >> + sdev->attrs.max_mr = SIW_MAX_MR; > >> + sdev->attrs.max_pd = SIW_MAX_PD; > >> + sdev->attrs.max_mw = SIW_MAX_MW; > >> + sdev->attrs.max_fmr = SIW_MAX_FMR; > >> + sdev->attrs.max_srq = SIW_MAX_SRQ; > >> + sdev->attrs.max_srq_wr = SIW_MAX_SRQ_WR; > >> + sdev->attrs.max_srq_sge = SIW_MAX_SGE; > >> + > >> + siw_idr_init(sdev); > > > >Why idr_init/release functions? Seems unnecessary > > idr structs are allocated for each dynamically allocated > siw device (one device per link). So each device has its own > id space (e.g. for STags/Keys). So I think I need to > initialize/destroy the idr resources dynamically as well. I mean why the function siw_idr_init called from exactly one place that has a couple of lines in it? > >> +static void siw_device_goes_down(struct siw_device *sdev) > >> +{ > >> + if (ib_device_try_get(&sdev->base_dev)) { > >> + INIT_WORK(&sdev->netdev_down, siw_netdev_down); > >> + schedule_work(&sdev->netdev_down); > > > >.. ie here it should be the system unbound wq, but that would > >deadlock > >with ib_unregister_device_queued .. > > > > Hmm, can you please help me understanding that > concern? I agree I should check if schedule_work() > succeeds (I have to fix that, but it should never happen), > but otherwise? > How can it deadlock? The work queue could be busy with work that is waiting for the refcount to go to 0 and not executing the work that takes the refcount to zero. > >> +static int siw_netdev_event(struct notifier_block *nb, unsigned > >long event, > >> + void *arg) > >> +{ > >> + struct net_device *netdev = netdev_notifier_info_to_dev(arg); > >> + struct ib_device *base_dev; > >> + struct siw_device *sdev; > >> + > >> + dev_dbg(&netdev->dev, "siw: event %lu\n", event); > >> + > >> + if (dev_net(netdev) != &init_net) > >> + return NOTIFY_OK; > >> + > >> + base_dev = ib_device_get_by_netdev(netdev, RDMA_DRIVER_SIW); > >> + if (!base_dev) > >> + return NOTIFY_OK; > >> + > >> + sdev = to_siw_dev(base_dev); > >> + > >> + switch (event) { > >> + > >> + case NETDEV_UP: > >> + sdev->state = IB_PORT_ACTIVE; > >> + siw_port_event(sdev, 1, IB_EVENT_PORT_ACTIVE); > >> + break; > >> + > >> + case NETDEV_GOING_DOWN: > >> + siw_device_goes_down(sdev); > >> + break; > > > >This probably wants to be return as there is a double put > >otherwise. Never tested? > > > I tested bringing down interfaces, also while there is siw caused > traffic on it. > siw_device_goes_down() calls ib_device_try_get(), which > schedules work only if it got that additional reference > (which then get's put in the work handler). So I don't see a > potential double put case (yet)...? It is not such a great idea to call ib_device_try_get when you already have a get, as in that case it cannot fail and things just get confusing. Jason