Re: [v5 01/12] struct device: Add function callback durable_name

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 9/29/20 1:04 PM, Greg Kroah-Hartman wrote:
> On Tue, Sep 29, 2020 at 06:51:02PM +0100, Christoph Hellwig wrote:
>> Independ of my opinion on the whole scheme that I shared last time,
>> we really should not bloat struct device with function pointers.

Christoph, thank you for sharing a bit more of your concerns and
bringing Greg into the discussion.  It's more productive.

>>
>> On Fri, Sep 25, 2020 at 11:19:18AM -0500, Tony Asleson wrote:
>>> Function callback and function to be used to write a persistent
>>> durable name to the supplied character buffer.  This will be used to add
>>> structured key-value data to log messages for hardware related errors
>>> which allows end users to correlate message and specific hardware.
>>>
>>> Signed-off-by: Tony Asleson <tasleson@xxxxxxxxxx>
>>> ---
>>>  drivers/base/core.c    | 24 ++++++++++++++++++++++++
>>>  include/linux/device.h |  4 ++++
>>>  2 files changed, 28 insertions(+)
> 
> I can't find this patch anywhere in my archives, why was I not cc:ed on
> it?  It's a v5 and no one thought to ask the driver core
> developers/maintainers about it???

You were CC'd into v3, ref.

https://lore.kernel.org/linux-ide/20200714081750.GB862637@xxxxxxxxx/

I should have continued to CC you on it, sorry about that.


> {sigh}
> 
> And for log messages, what about the dynamic debug developers, why not
> include them as well?  Since when is this a storage-only thing?

Hannes Reinecke has been involved in the discussion some and he's
involved in dynamic debug AFAIK.

If others have a need to identify a specific piece of hardware in a
potential sea of identical hardware that is encountering errors and
logging messages and can optionally be added and removed at run-time,
then yes they should be included too.  There is nothing with this patch
series that is preventing any device from registering a callback which
provides this information when asked.


>>> diff --git a/include/linux/device.h b/include/linux/device.h
>>> index 5efed864b387..074125999dd8 100644
>>> --- a/include/linux/device.h
>>> +++ b/include/linux/device.h
>>> @@ -614,6 +614,8 @@ struct device {
>>>  	struct iommu_group	*iommu_group;
>>>  	struct dev_iommu	*iommu;
>>>  
>>> +	int (*durable_name)(const struct device *dev, char *buff, size_t len);
> 
> No, that's not ok at all, why is this even a thing?
> 
> Who is setting this?  Why can't the bus do this without anything
> "special" needed from the driver core?

I'm setting it in the different storage subsystems.  The intent is that
when we go through a common logging path eg. dev_printk you can ask the
device what it's ID is so that it can be logged as structured data with
the log message.  I was trying to avoid having separate logging
functions all over the place which enforces a consistent meta data to
log messages, but maybe that would be better than adding a function
pointer to struct device?  My first patch tried adding a call back to
struct device_type, but that ran into the issue where struct device_type
is static const in a number of areas.

Basically for any piece of hardware with a serial number, call this
function to get it.  That was the intent anyway.

> We have a mapping of 'struct device' to a unique hardware device at a
> specific point in time, why are you trying to create another one?

I don't think I'm creating anything new.  Can you clarify this a bit
more?  I'm trying to leverage what is already in place.

> What is wrong with what we have today?

I'm trying to figure out a way to positively identify which storage
device an error belongs to over time.  Logging how something is attached
and not what is attached, only solves for right now, this point in time.
 Additionally when the only identifier is in the actual message itself,
it makes user space break every time the message content changes.  Also
what we have today in dev_printk doesn't tag the meta-data consistently
for journald to leverage.

This patch series adds structured data to the log entry for positive
identification.  It doesn't change when the content of the log message
changes.  It's true over time and it's true if a device gets moved to a
different system.

> So this is a HARD NAK on this patch for now.

Thank you for supplying some feedback and asking questions.  I've been
asking for suggestions and would very much like to have a discussion on
how this issue is best solved.  I'm not attached to what I've provided.
I'm just trying to get towards a solution.


We've looked at user space quite a bit and there is an inherit race
condition with trying to fetch the unique hardware id for a message when
it gets emitted from the kernel as udev rules haven't even run (assuming
we even have the meta-data to make the association).  The last thing
people want to do is delay writing the log message to disk until the
device it belongs to can be identified.  Of course this patch series
still has a window from where the device is first discovered by the
kernel and fetches the needed vpd data from the device.  Any kernel
messages logged during that time have no id to associate with it.

Thanks,
Tony





[Index of Archives]     [Linux Filesystems]     [Linux SCSI]     [Linux RAID]     [Git]     [Kernel Newbies]     [Linux Newbie]     [Security]     [Netfilter]     [Bugtraq]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Linux Security]     [Samba]     [Device Mapper]

  Powered by Linux