On 1/17/2019 4:04 PM, Jason Gunthorpe wrote:
On Thu, Jan 17, 2019 at 12:41:12PM -0800, Dennis Dalessandro wrote:
From: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
Commit 896de0090a85 ("RDMA/core: Use dev_name instead of
ibdev->name"), changes the API for setting the IB device name.
HFI, QIB, and RDMAVT make assumptions with regards to this API (i.e.
as to when the device name string is available). This assumption is
no longer valid.
Update the drivers to reflect the commit by storing the specific
device name in device local information. Create a calldown for the
RVT driver to access the drivers name information.
Rework the logging macros to use the best device name.
Fixes: 11f0e89710eb ("IB/{hfi1, qib}: Fix a concurrency issue with device name in logging")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
Reviewed-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
---
drivers/infiniband/hw/hfi1/affinity.c | 6 +++--
drivers/infiniband/hw/hfi1/driver.c | 8 +++++++
drivers/infiniband/hw/hfi1/hfi.h | 22 ++++++++-----------
drivers/infiniband/hw/hfi1/init.c | 2 +-
drivers/infiniband/hw/hfi1/verbs.c | 1 +
drivers/infiniband/hw/qib/qib.h | 12 +++++-----
drivers/infiniband/hw/qib/qib_driver.c | 8 +++++++
drivers/infiniband/hw/qib/qib_init.c | 10 ++++++--
drivers/infiniband/hw/qib/qib_verbs.c | 1 +
drivers/infiniband/sw/rdmavt/trace.h | 6 +++--
drivers/infiniband/sw/rdmavt/vt.c | 2 +-
drivers/infiniband/sw/rdmavt/vt.h | 10 ++++++--
include/rdma/rdma_vt.h | 38 ++++++--------------------------
13 files changed, 64 insertions(+), 62 deletions(-)
This is too ugly. Drivers should not be defining their own printing
macros, or dealing with this name stuff. They shouldn't be randomly
choosing to print to the PCI struct device vs the ib struct device and
all sorts of other inconsistent things we have.
Why is it ugly? This isn't new. There is nothing random about what is
chosen to print. The PCI is used before we have an IB dev. That doesn't
change before/after this patch.
Are you referring to inconsistent things in hfi1/rdmavt/qib or from an
IB core point of view?
If drivers need to print prior to registration, then we need to copy
what netdev has been doing and provide our own printing functions
accepting a struct ib_device* that go through various gyrations to
compute a suitable identification string across the entire life cycle
- ie netdev will use a PCI BDF and other options if the ethX name is
not yet assigned.
Which is what we do here, use the PCI BDF right? Did I miss something?
Or are you saying it should be added to the core rather than the driver.
I'd be OK with that as well.
I stopped short of doing this when I injected dev_name, because it
didn't seem necessary, but if you say hfi has a problem, then this is
how to fix it.
--- a/drivers/infiniband/hw/hfi1/init.c
+++ b/drivers/infiniband/hw/hfi1/init.c
@@ -1313,7 +1313,7 @@ void hfi1_free_devdata(struct hfi1_devdata *dd)
"Could not allocate unit ID: error %d\n", -ret);
goto bail;
}
- rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(), dd->unit);
+ kobject_set_name(&dd->kobj, "%s_%d", class_name(), dd->unit);
Woah, this driver has a kobj and a struct ib_device in the same
structure? Betcha the refcounting is done wrong :(
Been like that for quite a while. Mike has looked at this and I trust
him when he says the refcounting is correct. Do you see something wrong
here? If so please point it out.
This kobject is completely redundant with the kobject that is already
in the struct ib_device, you should delete it, not use it for more
stuff.
The kobject in the ib_device is just that, for ib_device stuff. The one
here is for hfi1 stuff. Why muddle them together? Is there an advantage
to that?
/*
* Initialize all locks for the device. This needs to be as early as
diff --git a/drivers/infiniband/hw/hfi1/verbs.c b/drivers/infiniband/hw/hfi1/verbs.c
index ec582d8..6546cba 100644
--- a/drivers/infiniband/hw/hfi1/verbs.c
+++ b/drivers/infiniband/hw/hfi1/verbs.c
@@ -1680,6 +1680,7 @@ int hfi1_register_ib_device(struct hfi1_devdata *dd)
* Fill in rvt info object.
*/
dd->verbs_dev.rdi.driver_f.port_callback = hfi1_create_port_files;
+ dd->verbs_dev.rdi.driver_f.get_device_name = hfi1_get_device_name;
dd->verbs_dev.rdi.driver_f.get_pci_dev = get_pci_dev;
dd->verbs_dev.rdi.driver_f.check_ah = hfi1_check_ah;
dd->verbs_dev.rdi.driver_f.notify_new_ah = hfi1_notify_new_ah;
diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h
index 83d2349..3681701 100644
--- a/drivers/infiniband/hw/qib/qib.h
+++ b/drivers/infiniband/hw/qib/qib.h
@@ -1075,6 +1075,8 @@ struct qib_devdata {
struct tasklet_struct error_tasklet;
int assigned_node_id; /* NUMA node closest to HCA */
+ /* device name */
+ char *name;
};
Don't store your own names :(
Can you explain the issue here?
-/**
- * rvt_set_ibdev_name - Craft an IB device name from client info
- * @rdi: pointer to the client rvt_dev_info structure
- * @name: client specific name
- * @unit: client specific unit number.
- */
-static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi,
- const char *fmt, const char *name,
- const int unit)
-{
- /*
- * FIXME: rvt and its users want to touch the ibdev before
- * registration and have things like the name work. We don't have the
- * infrastructure in the core to support this directly today, hack it
- * to work by setting the name manually here.
- */
- dev_set_name(&rdi->ibdev.dev, fmt, name, unit);
- strlcpy(rdi->ibdev.name, dev_name(&rdi->ibdev.dev), IB_DEVICE_NAME_MAX);
So why wasn't this hack good enough? What is the issue?
You literally wrote FIXME, so Mike took that to mean he should fix it.
-Denny