Re: [RFC PATCH] PCI: pciehp: Generate a RAS tracepoint for hotplug event

On 2024/11/10 01:52, Lukas Wunner wrote:
On Fri, Nov 08, 2024 at 11:09:39AM +0800, Shuai Xue wrote:
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -19,6 +19,7 @@
  #include <linux/types.h>
  #include <linux/pm_runtime.h>
  #include <linux/pci.h>
+#include <ras/ras_event.h>
  #include "pciehp.h"

Hm, why does the TRACE_EVENT() definition have to live in ras_event.h?
Why not, say, in pciehp.h?

IMHO, it is a RAS-related event, so I added it to ras_event.h, alongside
other events such as aer_event and memory_failure_event.

I could move it to pciehp.h, if the maintainers prefer that location.


@@ -245,6 +246,8 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
  		if (events & PCI_EXP_SLTSTA_PDC)
  			ctrl_info(ctrl, "Slot(%s): Card not present\n",
  				  slot_name(ctrl));
+		trace_pciehp_event(dev_name(&ctrl->pcie->port->dev),
+				   slot_name(ctrl), ON_STATE, events);
  		pciehp_disable_slot(ctrl, SURPRISE_REMOVAL);
  		break;
  	default:

I'd suggest using pci_name() instead of dev_name() as it's a little shorter.

Will use pci_name() instead.


Passing ON_STATE here isn't always accurate because there's
"case BLINKINGOFF_STATE" with a fallthrough preceding the
above code block.

Yes, you are right, I missed the above fallthrough case.
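
To illustrate the fallthrough problem, here is a minimal user-space sketch, not the driver code. The enum is a simplified stand-in for the pciehp slot states, and struct trace_record and handle_change() are hypothetical names made up for this example. It captures the old state before the switch, so the traced value is correct for both the BLINKINGOFF_STATE and the ON_STATE entry points:

```c
#include <stdbool.h>

/* Simplified stand-ins for the slot states; not the kernel definitions. */
enum slot_state {
	OFF_STATE,
	BLINKINGON_STATE,
	BLINKINGOFF_STATE,
	POWERON_STATE,
	POWEROFF_STATE,
	ON_STATE,
};

/* Hypothetical record of what a tracepoint would log. */
struct trace_record {
	enum slot_state old_state;
	bool valid;
};

/* Hypothetical model of the removal path with its fallthrough. */
static struct trace_record handle_change(enum slot_state *state)
{
	/* Capture the old state before the switch modifies it. */
	struct trace_record rec = { .old_state = *state, .valid = false };

	switch (*state) {
	case BLINKINGOFF_STATE:
		/* The real driver does extra work here, then falls through. */
		/* fall through */
	case ON_STATE:
		*state = POWEROFF_STATE;
		rec.valid = true;	/* old_state may be either case label */
		break;
	default:
		break;
	}
	return rec;
}
```

Hard-coding ON_STATE in the shared block would report the wrong old state whenever the slot was entered through the BLINKINGOFF_STATE label.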


Wouldn't it be more readable to just log the event that occurred
as a string, e.g. "Surprise Removal" (and "Insertion" or "Hot Add"
for the other trace event you're introducing) instead of the state?

Otherwise you see "ON_STATE" in the log but that's actually the
*old* value so you have to mentally convert this to "previously ON,
so now must be transitioning to OFF".

I see your point. "Surprise Removal" or "Insertion" is indeed the exact state
transition. However, I am concerned that using a string might make it difficult
for user space tools like rasdaemon to parse.

How about adding a new enum for state transition? For example:

	enum pciehp_trans_type {
		PCIEHP_SAFE_REMOVAL,
		PCIEHP_SURPRISE_REMOVAL,
		PCIEHP_HOT_ADD,
	...
	};

Then define the state transition as an int field in the tracepoint, so that
rasdaemon can parse the value easily.

	trace_pciehp_event(pci_name(ctrl->pcie->port),
		slot_name(ctrl), PCIEHP_SAFE_REMOVAL, events);

And have TP_printk print the symbolic name of the state transition:

	TRACE_EVENT(pciehp_event,
		TP_PROTO(const char *port_name,
			 const char *slot,
			 const int trans_state,
			 const u32 events),
	
		TP_ARGS(port_name, slot, trans_state, events),
	
		TP_STRUCT__entry(
			__string(	port_name,	port_name	)
			__string(	slot,		slot		)
			__field(	int,		trans_state	)
			__field(	u32,		events		)
		),
	
		TP_fast_assign(
			__assign_str(port_name, port_name);
			__assign_str(slot, slot);
			__entry->trans_state	= trans_state;
			__entry->events		= events;
		),
	
		TP_printk("%s slot:%s, state:%s, events:%#x",
			__get_str(port_name),
			__get_str(slot),
			__print_symbolic(__entry->trans_state,
				{ PCIEHP_SAFE_REMOVAL,		"Safe Removal" },
				{ PCIEHP_SURPRISE_REMOVAL,	"Surprise Removal" }),
			__entry->events)
	);
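
On the consumer side, a tool in the style of rasdaemon could map the integer back to a readable name with a small lookup table. The following is a minimal user-space sketch under stated assumptions: the TRANS_* constants are hypothetical local copies numbered from 0, and trans_names[] and trans_name() are made up for this example. It is not rasdaemon's actual code.

```c
#include <stddef.h>

/* Hypothetical local copies of the transition values, numbered from 0. */
enum trans_type {
	TRANS_SAFE_REMOVAL,
	TRANS_SURPRISE_REMOVAL,
	TRANS_HOT_ADD,
};

/* Lookup table indexed by the integer carried in the trace record. */
static const char *const trans_names[] = {
	[TRANS_SAFE_REMOVAL]     = "Safe Removal",
	[TRANS_SURPRISE_REMOVAL] = "Surprise Removal",
	[TRANS_HOT_ADD]          = "Hot Add",
};

/* Return a readable name, or "Unknown" for out-of-range values. */
static const char *trans_name(int value)
{
	size_t n = sizeof(trans_names) / sizeof(trans_names[0]);

	if (value < 0 || (size_t)value >= n)
		return "Unknown";
	return trans_names[value];
}
```

With an integer field, a parser only needs a bounds check and a table lookup. New transition types added later fall back to "Unknown" until the table is updated.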


I'm fine with adding trace points to pciehp, I just want to make sure
we do it in a way that's easy to parse for admins.

Thank you for the positive feedback :)


Thanks,

Lukas

Best Regards,
Shuai



