Re: [PATCH for-rc 1/7] IB/{hfi1, qib, rdmavt}: Do not depend on IB Verbs name for driver logging

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Feb 08, 2019 at 08:41:28AM -0500, Dennis Dalessandro wrote:
> On 1/17/2019 4:04 PM, Jason Gunthorpe wrote:
> > On Thu, Jan 17, 2019 at 12:41:12PM -0800, Dennis Dalessandro wrote:
> > > From: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
> > > 
> > > Commit 896de0090a85 ("RDMA/core: Use dev_name instead of
> > > ibdev->name"), changes the API for setting the IB device name.
> > > 
> > > HFI, QIB, and RDMAVT make assumptions with regards to this API (i.e.
> > > as to when the device name string is available).  This assumption is
> > > no longer valid.
> > > 
> > > Update the drivers to reflect the commit by storing the specific
> > > device name in device local information.  Create a calldown for the
> > > RVT driver to access the drivers name information.
> > > 
> > > Rework the logging macros to use the best device name.
> > > 
> > > Fixes: 11f0e89710eb ("IB/{hfi1, qib}: Fix a concurrency issue with device name in logging")
> > > Reviewed-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
> > > Reviewed-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
> > > Signed-off-by: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
> > > Signed-off-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
> > >   drivers/infiniband/hw/hfi1/affinity.c  |    6 +++--
> > >   drivers/infiniband/hw/hfi1/driver.c    |    8 +++++++
> > >   drivers/infiniband/hw/hfi1/hfi.h       |   22 ++++++++-----------
> > >   drivers/infiniband/hw/hfi1/init.c      |    2 +-
> > >   drivers/infiniband/hw/hfi1/verbs.c     |    1 +
> > >   drivers/infiniband/hw/qib/qib.h        |   12 +++++-----
> > >   drivers/infiniband/hw/qib/qib_driver.c |    8 +++++++
> > >   drivers/infiniband/hw/qib/qib_init.c   |   10 ++++++--
> > >   drivers/infiniband/hw/qib/qib_verbs.c  |    1 +
> > >   drivers/infiniband/sw/rdmavt/trace.h   |    6 +++--
> > >   drivers/infiniband/sw/rdmavt/vt.c      |    2 +-
> > >   drivers/infiniband/sw/rdmavt/vt.h      |   10 ++++++--
> > >   include/rdma/rdma_vt.h                 |   38 ++++++--------------------------
> > >   13 files changed, 64 insertions(+), 62 deletions(-)
> > 
> > This is too ugly. Drivers should not be defining their own printing
> > macros, or dealing with this name stuff. They shouldn't be randomly
> > choosing to print to the PCI struct device vs the ib struct device and
> > all sorts of other inconsistent things we have.
> 
> Why is it ugly? This isn't new. There is nothing random about what is chosen
> to print. The PCI is used before we have an IB dev. That doesn't change
> before/after this patch.

How will this private name get updated when device rename happens?

> Are you referring to inconsistent things in hfi1/rdmavt/qib or from an IB
> core point of view?

Yes, it is inconsistent from an IB core point of view.

> > If drivers need to print prior to registration, then we need to copy
> > what netdev has been doing and provide our own printing functions
> > accepting a struct ib_device* that go through various gyrations to
> > compute a suitable identification string across the entire life cycle
> > - ie netdev will use a PCI BDF and other options if the ethX name is
> > not yet assigned.
> 
> Which is what we do here, use the PCI BDF right? Did I miss something? Or
> are you saying it should be added to the core rather than the driver. I'd be
> OK with that as well.

Yes

> > I stopped short of doing this when I injected dev_name, because it
> > didn't seem necessary, but if you say hfi has a problem, then this is
> > how to fix it.
> > 
> > > +++ b/drivers/infiniband/hw/hfi1/init.c
> > > @@ -1313,7 +1313,7 @@ void hfi1_free_devdata(struct hfi1_devdata *dd)
> > >   			"Could not allocate unit ID: error %d\n", -ret);
> > >   		goto bail;
> > >   	}
> > > -	rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), dd->unit);
> > > +	kobject_set_name(&dd->kobj, "%s_%d", class_name(), dd->unit);
> > 
> > Woah, this driver has a kobj and a struct ib_device in the same
> > structure? Betcha the refcounting is done wrong :(
> 
> Been like that for quite a while. Mike has looked at this and I trust him
> when he says the refcounting is correct. Do you see something wrong here? If
> so please point it out.
> 
> > This kobject is completely redundant with the kobject that is already
> > in the struct ib_device, you should delete it, not use it for more
> > stuff.
> 
> The kobject in the ib_device is just that, for ib_device stuff. The one here
> is for hfi1 stuff. Why muddle them together? Is there an advantage to that?

The kobject literally *doesn't do anything*

On kobject/kref per memory allocation except in bizzaro situations.

> > >   	/*
> > >   	 * Initialize all locks for the device. This needs to be as early as
> > > diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
> > > index ec582d8..6546cba 100644
> > > +++ b/drivers/infiniband/hw/hfi1/verbs.c
> > > @@ -1680,6 +1680,7 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
> > >   	 * Fill in rvt info object.
> > >   	 */
> > >   	dd->verbs_dev.rdi.driver_f.port_callback = hfi1_create_port_files;
> > > +	dd->verbs_dev.rdi.driver_f.get_device_name = hfi1_get_device_name;
> > >   	dd->verbs_dev.rdi.driver_f.get_pci_dev = get_pci_dev;
> > >   	dd->verbs_dev.rdi.driver_f.check_ah = hfi1_check_ah;
> > >   	dd->verbs_dev.rdi.driver_f.notify_new_ah = hfi1_notify_new_ah;
> > > diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h
> > > index 83d2349..3681701 100644
> > > +++ b/drivers/infiniband/hw/qib/qib.h
> > > @@ -1075,6 +1075,8 @@ struct qib_devdata {
> > >   	struct tasklet_struct error_tasklet;
> > >   	int assigned_node_id; /* NUMA node closest to HCA */
> > > +	/* device name */
> > > +	char *name;
> > >   };
> > 
> > Don't store your own names :(
> 
> Can you explain the issue here?

How will rename work?

> > 
> > > -/**
> > > - * rvt_set_ibdev_name - Craft an IB device name from client info
> > > - * @rdi: pointer to the client rvt_dev_info structure
> > > - * @name: client specific name
> > > - * @unit: client specific unit number.
> > > - */
> > > -static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi,
> > > -				      const char *fmt, const char *name,
> > > -				      const int unit)
> > > -{
> > > -	/*
> > > -	 * FIXME: rvt and its users want to touch the ibdev before
> > > -	 * registration and have things like the name work. We don't have the
> > > -	 * infrastructure in the core to support this directly today, hack it
> > > -	 * to work by setting the name manually here.
> > > -	 */
> > > -	dev_set_name(&rdi->ibdev.dev, fmt, name, unit);
> > > -	strlcpy(rdi->ibdev.name, dev_name(&rdi->ibdev.dev), IB_DEVICE_NAME_MAX);
> > 
> > So why wasn't this hack good enough? What is the issue?
> 
> You literally wrote FIXME, so Mike took that to mean he should fix it.

Yes, the comment says "we don't have the infrastructure in the core
to support this" so the implication is to add more infrastructure in
the core code to support an early name assignment, or more
infrastructure to avoid this need in the driver (ie logging helpers
like netdev).

Not more ugly driver hackery.

Jason



[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux