On 2024/11/10 01:52, Lukas Wunner wrote:
On Fri, Nov 08, 2024 at 11:09:39AM +0800, Shuai Xue wrote:
--- a/drivers/pci/hotplug/pciehp_ctrl.c
+++ b/drivers/pci/hotplug/pciehp_ctrl.c
@@ -19,6 +19,7 @@
#include <linux/types.h>
#include <linux/pm_runtime.h>
#include <linux/pci.h>
+#include <ras/ras_event.h>
#include "pciehp.h"
Hm, why does the TRACE_EVENT() definition have to live in ras_event.h?
Why not, say, in pciehp.h?
IMHO, it is a RAS-related event, so I added it to ras_event.h, alongside
other events such as aer_event and memory_failure_event.
I can move it to pciehp.h if the maintainers prefer that location.
@@ -245,6 +246,8 @@ void pciehp_handle_presence_or_link_change(struct controller *ctrl, u32 events)
 		if (events & PCI_EXP_SLTSTA_PDC)
 			ctrl_info(ctrl, "Slot(%s): Card not present\n",
 				  slot_name(ctrl));
+		trace_pciehp_event(dev_name(&ctrl->pcie->port->dev),
+				   slot_name(ctrl), ON_STATE, events);
 		pciehp_disable_slot(ctrl, SURPRISE_REMOVAL);
 		break;
 	default:
I'd suggest using pci_name() instead of dev_name() as it's a little shorter.
Will use pci_name() instead.
Passing ON_STATE here isn't always accurate because there's
"case BLINKINGOFF_STATE" with a fallthrough preceding the
above code block.
Yes, you are right, I missed the above fallthrough case.
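To make the fallthrough problem concrete, here is a minimal user-space sketch, not the kernel code. The enum is a simplified subset of the pciehp slot states, and removal_old_state() is a hypothetical helper introduced only for illustration. It shows that the shared code path is reached from two different old states, so a hard-coded ON_STATE would misreport one of them:

```c
#include <stddef.h>

/* Simplified subset of the pciehp slot states (assumption: a sketch only). */
enum slot_state { OFF_STATE, BLINKINGOFF_STATE, ON_STATE };

/*
 * Hypothetical helper: name of the state the slot was in when the
 * removal was handled. The switch mirrors the shape of the one in
 * pciehp_handle_presence_or_link_change(), where BLINKINGOFF_STATE
 * falls through into the ON_STATE handling.
 */
static const char *removal_old_state(enum slot_state state)
{
	switch (state) {
	case BLINKINGOFF_STATE:
		/* the real handler does extra work here before falling through */
		/* fall through */
	case ON_STATE:
		/* reached from both cases, so report the actual old state */
		return state == ON_STATE ? "ON_STATE" : "BLINKINGOFF_STATE";
	default:
		return NULL;	/* slot was not on: no removal to report */
	}
}
```

Logging the value read from the controller, or the transition itself, avoids the mismatch.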
Wouldn't it be more readable to just log the event that occurred
as a string, e.g. "Surprise Removal" (and "Insertion" or "Hot Add"
for the other trace event you're introducing) instead of the state?
Otherwise you see "ON_STATE" in the log but that's actually the
*old* value so you have to mentally convert this to "previously ON,
so now must be transitioning to OFF".
I see your point. "Surprise Removal" or "Insertion" is indeed the exact state
transition. However, I am concerned that using a string might make it difficult
for user space tools like rasdaemon to parse.
How about adding a new enum for state transition? For example:
enum pciehp_trans_type {
	PCIEHP_SAFE_REMOVAL,
	PCIEHP_SURPRISE_REMOVAL,
	PCIEHP_HOT_ADD,
	...
};
The tracepoint would then record the state transition as an int, which
rasdaemon can parse easily:
trace_pciehp_event(pci_name(&ctrl->pcie->port->dev),
slot_name(ctrl), PCIEHP_SAFE_REMOVAL, events);
And TP_printk would print the symbolic name of the state transition:
TRACE_EVENT(pciehp_event,
	TP_PROTO(const char *port_name,
		 const char *slot,
		 const int trans_state,
		 const u32 events),
	TP_ARGS(port_name, slot, trans_state, events),
	TP_STRUCT__entry(
		__string(	port_name,	port_name	)
		__string(	slot,		slot		)
		__field(	int,		trans_state	)
		__field(	u32,		events		)
	),
	TP_fast_assign(
		__assign_str(port_name, port_name);
		__assign_str(slot, slot);
		__entry->trans_state = trans_state;
		__entry->events = events;
	),
	/* __print_symbolic() takes { value, "name" } pairs and yields a string */
	TP_printk("%s slot:%s, state:%s, events:%#x",
		__get_str(port_name),
		__get_str(slot),
		__print_symbolic(__entry->trans_state,
			{ PCIEHP_SAFE_REMOVAL,		"Safe Removal" },
			{ PCIEHP_SURPRISE_REMOVAL,	"Surprise Removal" }),
		__entry->events)
);
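On the user-space side, a tool such as rasdaemon could map the integer field back to a name with a small lookup table. The following is only a sketch under assumptions: the enum values are not finalized in this thread, the third enumerator is spelled in upper case here, and pciehp_trans_name() and the strings are hypothetical, not part of rasdaemon or the kernel.

```c
#include <stddef.h>

/* Assumed values: the enum proposed in this thread is not finalized. */
enum pciehp_trans_type {
	PCIEHP_SAFE_REMOVAL,
	PCIEHP_SURPRISE_REMOVAL,
	PCIEHP_HOT_ADD,
};

/* Names indexed by enum value via designated initializers. */
static const char *const pciehp_trans_names[] = {
	[PCIEHP_SAFE_REMOVAL]     = "Safe Removal",
	[PCIEHP_SURPRISE_REMOVAL] = "Surprise Removal",
	[PCIEHP_HOT_ADD]          = "Hot Add",
};

/* Hypothetical decoder: returns "Unknown" for values outside the table. */
static const char *pciehp_trans_name(int trans_state)
{
	size_t n = sizeof(pciehp_trans_names) / sizeof(pciehp_trans_names[0]);

	if (trans_state < 0 || (size_t)trans_state >= n)
		return "Unknown";
	return pciehp_trans_names[trans_state];
}
```

Because the raw field stays an int, the parser does not depend on the wording that TP_printk emits.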
I'm fine with adding trace points to pciehp, I just want to make sure
we do it in a way that's easy to parse for admins.
Thank you for the positive feedback :)
Thanks,
Lukas
Best Regards,
Shuai