RE: [PATCH for-rc 1/7] IB/{hfi1, qib, rdmavt}: Do not depend on IB Verbs name for driver logging

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



>-----Original Message-----
>From: Dalessandro, Dennis
>Sent: Friday, February 8, 2019 8:41 AM
>To: Jason Gunthorpe <jgg@xxxxxxxx>
>Cc: dledford@xxxxxxxxxx; linux-rdma@xxxxxxxxxxxxxxx; Ruhl, Michael J
><michael.j.ruhl@xxxxxxxxx>; Marciniszyn, Mike
><mike.marciniszyn@xxxxxxxxx>
>Subject: Re: [PATCH for-rc 1/7] IB/{hfi1, qib, rdmavt}: Do not depend on IB
>Verbs name for driver logging
>
>On 1/17/2019 4:04 PM, Jason Gunthorpe wrote:
>> On Thu, Jan 17, 2019 at 12:41:12PM -0800, Dennis Dalessandro wrote:
>>> From: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
>>>
>>> Commit 896de0090a85 ("RDMA/core: Use dev_name instead of
>>> ibdev->name"), changes the API for setting the IB device name.
>>>
>>> HFI, QIB, and RDMAVT make assumptions with regards to this API (i.e.
>>> as to when the device name string is available).  This assumption is
>>> no longer valid.
>>>
>>> Update the drivers to reflect the commit by storing the specific
>>> device name in device local information.  Create a calldown for the
>>> RVT driver to access the drivers name information.
>>>
>>> Rework the logging macros to use the best device name.
>>>
>>> Fixes: 11f0e89710eb ("IB/{hfi1, qib}: Fix a concurrency issue with device
>name in logging")
>>> Reviewed-by: Mike Marciniszyn <mike.marciniszyn@xxxxxxxxx>
>>> Reviewed-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
>>> Signed-off-by: Michael J. Ruhl <michael.j.ruhl@xxxxxxxxx>
>>> Signed-off-by: Dennis Dalessandro <dennis.dalessandro@xxxxxxxxx>
>>> ---
>>>   drivers/infiniband/hw/hfi1/affinity.c  |    6 +++--
>>>   drivers/infiniband/hw/hfi1/driver.c    |    8 +++++++
>>>   drivers/infiniband/hw/hfi1/hfi.h       |   22 ++++++++-----------
>>>   drivers/infiniband/hw/hfi1/init.c      |    2 +-
>>>   drivers/infiniband/hw/hfi1/verbs.c     |    1 +
>>>   drivers/infiniband/hw/qib/qib.h        |   12 +++++-----
>>>   drivers/infiniband/hw/qib/qib_driver.c |    8 +++++++
>>>   drivers/infiniband/hw/qib/qib_init.c   |   10 ++++++--
>>>   drivers/infiniband/hw/qib/qib_verbs.c  |    1 +
>>>   drivers/infiniband/sw/rdmavt/trace.h   |    6 +++--
>>>   drivers/infiniband/sw/rdmavt/vt.c      |    2 +-
>>>   drivers/infiniband/sw/rdmavt/vt.h      |   10 ++++++--
>>>   include/rdma/rdma_vt.h                 |   38 ++++++--------------------------
>>>   13 files changed, 64 insertions(+), 62 deletions(-)
>>
>> This is too ugly. Drivers should not be defining their own printing
>> macros, or dealing with this name stuff. They shouldn't be randomly
>> choosing to print to the PCI struct device vs the ib struct device and
>> all sorts of other inconsistent things we have.
>
>Why is it ugly? This isn't new. There is nothing random about what is
>chosen to print. The PCI is used before we have an IB dev. That doesn't
>change before/after this patch.
>
>Are you referring to inconsistent things in hfi1/rdmavt/qib or from an
>IB core point of view?
>
>> If drivers need to print prior to registration, then we need to copy
>> what netdev has been doing and provide our own printing functions
>> accepting a struct ib_device* that go through various gyrations to
>> compute a suitable identification string across the entire life cycle
>> - ie netdev will use a PCI BDF and other options if the ethX name is
>> not yet assigned.
>
>Which is what we do here, use the PCI BDF right? Did I miss something?
>Or are you saying it should be added to the core rather than the driver.
>I'd be OK with that as well.
>
>> I stopped short of doing this when I injected dev_name, because it
>> didn't seem necessary, but if you say hfi has a problem, then this is
>> how to fix it.
>>
>>> --- a/drivers/infiniband/hw/hfi1/init.c
>>> +++ b/drivers/infiniband/hw/hfi1/init.c
>>> @@ -1313,7 +1313,7 @@ void hfi1_free_devdata(struct hfi1_devdata *dd)
>>>   			"Could not allocate unit ID: error %d\n", -ret);
>>>   		goto bail;
>>>   	}
>>> -	rvt_set_ibdev_name(&dd->verbs_dev.rdi, "%s_%d", class_name(),
>dd->unit);
>>> +	kobject_set_name(&dd->kobj, "%s_%d", class_name(), dd->unit);
>>
>> Woah, this driver has a kobj and a struct ib_device in the same
>> structure? Betcha the refcounting is done wrong :(
>
>Been like that for quite a while. Mike has looked at this and I trust
>him when he says the refcounting is correct. Do you see something wrong
>here? If so please point it out.
>
>> This kobject is completely redundant with the kobject that is already
>> in the struct ib_device, you should delete it, not use it for more
>> stuff.
>
>The kobject in the ib_device is just that, for ib_device stuff. The one
>here is for hfi1 stuff. Why muddle them together? Is there an advantage
>to that?
>
>>>   	/*
>>>   	 * Initialize all locks for the device. This needs to be as early as
>>> diff --git a/drivers/infiniband/hw/hfi1/verbs.c
>b/drivers/infiniband/hw/hfi1/verbs.c
>>> index ec582d8..6546cba 100644
>>> --- a/drivers/infiniband/hw/hfi1/verbs.c
>>> +++ b/drivers/infiniband/hw/hfi1/verbs.c
>>> @@ -1680,6 +1680,7 @@ int hfi1_register_ib_device(struct hfi1_devdata
>*dd)
>>>   	 * Fill in rvt info object.
>>>   	 */
>>>   	dd->verbs_dev.rdi.driver_f.port_callback = hfi1_create_port_files;
>>> +	dd->verbs_dev.rdi.driver_f.get_device_name =
>hfi1_get_device_name;
>>>   	dd->verbs_dev.rdi.driver_f.get_pci_dev = get_pci_dev;
>>>   	dd->verbs_dev.rdi.driver_f.check_ah = hfi1_check_ah;
>>>   	dd->verbs_dev.rdi.driver_f.notify_new_ah = hfi1_notify_new_ah;
>>> diff --git a/drivers/infiniband/hw/qib/qib.h
>b/drivers/infiniband/hw/qib/qib.h
>>> index 83d2349..3681701 100644
>>> --- a/drivers/infiniband/hw/qib/qib.h
>>> +++ b/drivers/infiniband/hw/qib/qib.h
>>> @@ -1075,6 +1075,8 @@ struct qib_devdata {
>>>   	struct tasklet_struct error_tasklet;
>>>
>>>   	int assigned_node_id; /* NUMA node closest to HCA */
>>> +	/* device name */
>>> +	char *name;
>>>   };
>>
>> Don't store your own names :(
>
>Can you explain the issue here?
>
>>
>>> -/**
>>> - * rvt_set_ibdev_name - Craft an IB device name from client info
>>> - * @rdi: pointer to the client rvt_dev_info structure
>>> - * @name: client specific name
>>> - * @unit: client specific unit number.
>>> - */
>>> -static inline void rvt_set_ibdev_name(struct rvt_dev_info *rdi,
>>> -				      const char *fmt, const char *name,
>>> -				      const int unit)
>>> -{
>>> -	/*
>>> -	 * FIXME: rvt and its users want to touch the ibdev before
>>> -	 * registration and have things like the name work. We don't have the
>>> -	 * infrastructure in the core to support this directly today, hack it
>>> -	 * to work by setting the name manually here.
>>> -	 */
>>> -	dev_set_name(&rdi->ibdev.dev, fmt, name, unit);
>>> -	strlcpy(rdi->ibdev.name, dev_name(&rdi->ibdev.dev),
>IB_DEVICE_NAME_MAX);
>>
>> So why wasn't this hack good enough? What is the issue?
>
>You literally wrote FIXME, so Mike took that to mean he should fix it.

I have an alternate patch that uses:

1) a device name private to the rdmavt instance (for rdmavt logging) (adds the name as a char)
2) doesn't use the kobject from HFI
3) doesn't use the char in QIB
4) uses hardwired text rather than a string in the logging function.: i.e : "hfi1_%d", dd->unit.

Would it be useful to submit that?

M

>-Denny
>
>





[Index of Archives]     [Linux USB Devel]     [Video for Linux]     [Linux Audio Users]     [Photo]     [Yosemite News]     [Yosemite Photos]     [Linux Kernel]     [Linux SCSI]     [XFree86]

  Powered by Linux