Re: [PATCH] selftest: drivers: Add support to check duplicate hwirq

On 2024/10/19 3:34 AM, Bjorn Helgaas wrote:
On Tue, Sep 03, 2024 at 06:44:26PM -0700, Joseph Jang wrote:
Validate there are no duplicate hwirq from the irq debug
file system /sys/kernel/debug/irq/irqs/* per chip name.

One example log shows two duplicated hwirqs in the irq debug
file system.

$ sudo cat /sys/kernel/debug/irq/irqs/163
handler:  handle_fasteoi_irq
device:   0019:00:00.0
      <SNIP>
node:     1
affinity: 72-143
effectiv: 76
domain:  irqchip@0x0000100022040000-3
  hwirq:   0xc8000000
  chip:    ITS-MSI
   flags:   0x20

$ sudo cat /sys/kernel/debug/irq/irqs/174
handler:  handle_fasteoi_irq
device:   0039:00:00.0
     <SNIP>
node:     3
affinity: 216-287
effectiv: 221
domain:  irqchip@0x0000300022040000-3
  hwirq:   0xc8000000
  chip:    ITS-MSI
   flags:   0x20

The irq-check.sh script collects the hwirq and chip name from
/sys/kernel/debug/irq/irqs/* and prints an error log when it finds a
duplicate hwirq per chip name.

Kernel patch ("PCI/MSI: Fix MSI hwirq truncation") [1] fixes the above issue.
[1]: https://lore.kernel.org/all/20240115135649.708536-1-vidyas@xxxxxxxxxx/

I don't know enough about this issue to understand the details.  It
seems like you look for duplicate hwirqs in chips with the same name,
e.g., "ITS-MSI" in this case?  That name seems too generic to me
(might there be several instances of "ITS-MSI" in a system?)


As far as I know, each PCIe device typically has only one ITS-MSI
controller. Having multiple ITS-MSI instances for the same device would
lead to confusion and potential conflicts in interrupt routing.

Also, the name may come from chip->irq_print_chip(), so it apparently
relies on irqchip drivers to make the names unique if there are
multiple instances?

I would have expected looking for duplicates inside something more
specific, like "irqchip@0x0000300022040000-3".  But again, I don't
know enough about the problem to speak confidently here.


In our case, suppose we looked for duplicates per irq domain, such as
"irqchip@0x0000100022040000-3" and "irqchip@0x0000300022040000-3", as in
the following example:

    $ sudo cat /sys/kernel/debug/irq/irqs/163
    handler:  handle_fasteoi_irq
    device:   0019:00:00.0
         <SNIP>
    node:     1
    affinity: 72-143
    effectiv: 76
    domain:  irqchip@0x0000100022040000-3
     hwirq:   0xc8000000
     chip:    ITS-MSI
      flags:   0x20
    $ sudo cat /sys/kernel/debug/irq/irqs/174
    handler:  handle_fasteoi_irq
    device:   0039:00:00.0
        <SNIP>
    node:     3
    affinity: 216-287
    effectiv: 221
    domain:  irqchip@0x0000300022040000-3
     hwirq:   0xc8000000
     chip:    ITS-MSI
      flags:   0x20

We would not be able to detect the duplicated hwirq number (0xc8000000)
in this case, because the two entries are in different irq domains.
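
To make the difference concrete, here is a minimal sketch that runs the
same "sort | uniq" duplicate check on the two entries above, once keyed on
the chip name and once keyed on the domain. The sample variable and the
two function names are made up for this illustration and are not part of
the posted patch.

```shell
#!/bin/bash
# Sample data taken from the irq 163 and irq 174 dumps above,
# one "<domain> <chip> <hwirq>" triple per line.
sample='irqchip@0x0000100022040000-3 ITS-MSI 0xc8000000
irqchip@0x0000300022040000-3 ITS-MSI 0xc8000000'

# Hypothetical helper: key on "<chip> <hwirq>", as irq-check.sh does.
dups_by_chip() {
	echo "$sample" | awk '{print $2, $3}' | sort | uniq -d
}

# Hypothetical helper: key on "<domain> <hwirq>", the alternative
# raised in review.
dups_by_domain() {
	echo "$sample" | awk '{print $1, $3}' | sort | uniq -d
}

echo "by chip:   $(dups_by_chip)"
echo "by domain: $(dups_by_domain)"
```

With this sample, the chip-name key reports "ITS-MSI 0xc8000000" as a
duplicate, while the domain key reports nothing.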


Cosmetic nits:

   - Tweak subject to match history (use "git log --oneline
     tools/testing/selftests/drivers/" to see it), e.g.,

       selftests: irq: Add check for duplicate hwirq

   - Rewrap commit log to fill 75 columns.  No point in using shorter
     lines.

   - Indent the "$ sudo cat ..." block by a couple spaces since it's
     effectively a quotation, not part of the main text body.

   - Possibly include sample output of irq-check.sh (also indented as a
     quote) when run on the system where you manually found the
     duplicate via "sudo cat /sys/kernel/debug/irq/irqs/..."

   - Reword "The irq-check.sh can help ..." to something like this:

       Add an irq-check.sh test to report errors when there are
       duplicate hwirqs per chip name.

   - Since the kernel patch has already been merged, cite it like this
     instead of using the https://lore URL:

       db744ddd59be ("PCI/MSI: Prevent MSI hardware interrupt number truncation")


If you agree with using the irq chip name ("ITS-MSI") to scan for
duplicate hwirqs, I can send a v2 patch that addresses the suggestions
above.


Thank you,
Joseph.

Signed-off-by: Joseph Jang <jjang@xxxxxxxxxx>
Reviewed-by: Matthew R. Ochs <mochs@xxxxxxxxxx>
---
  tools/testing/selftests/drivers/irq/Makefile  |  5 +++
  tools/testing/selftests/drivers/irq/config    |  2 +
  .../selftests/drivers/irq/irq-check.sh        | 39 +++++++++++++++++++
  3 files changed, 46 insertions(+)
  create mode 100644 tools/testing/selftests/drivers/irq/Makefile
  create mode 100644 tools/testing/selftests/drivers/irq/config
  create mode 100755 tools/testing/selftests/drivers/irq/irq-check.sh

diff --git a/tools/testing/selftests/drivers/irq/Makefile b/tools/testing/selftests/drivers/irq/Makefile
new file mode 100644
index 000000000000..d6998017c861
--- /dev/null
+++ b/tools/testing/selftests/drivers/irq/Makefile
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+
+TEST_PROGS := irq-check.sh
+
+include ../../lib.mk
diff --git a/tools/testing/selftests/drivers/irq/config b/tools/testing/selftests/drivers/irq/config
new file mode 100644
index 000000000000..a53d3b713728
--- /dev/null
+++ b/tools/testing/selftests/drivers/irq/config
@@ -0,0 +1,2 @@
+CONFIG_GENERIC_IRQ_DEBUGFS=y
+CONFIG_GENERIC_IRQ_INJECTION=y
diff --git a/tools/testing/selftests/drivers/irq/irq-check.sh b/tools/testing/selftests/drivers/irq/irq-check.sh
new file mode 100755
index 000000000000..e784777043a1
--- /dev/null
+++ b/tools/testing/selftests/drivers/irq/irq-check.sh
@@ -0,0 +1,39 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0
+
+# This script needs root permission
+uid=$(id -u)
+if [ $uid -ne 0 ]; then
+	echo "SKIP: Must be run as root"
+	exit 4
+fi
+
+# Ensure debugfs is mounted
+mount -t debugfs nodev /sys/kernel/debug 2>/dev/null
+if [ ! -d "/sys/kernel/debug/irq/irqs" ]; then
+	echo "SKIP: irq debugfs not found"
+	exit 4
+fi
+
+# Traverse the irq debug file system directory to collect chip_name and hwirq
+hwirq_list=$(for irq_file in /sys/kernel/debug/irq/irqs/*; do
+	# Read chip name and hwirq from the irq_file
+	chip_name=$(cat "$irq_file" | grep -m 1 'chip:' | awk '{print $2}')
+	hwirq=$(cat "$irq_file" | grep -m 1 'hwirq:' | awk '{print $2}' )
+
+	if [ -z "$chip_name" ] || [ -z "$hwirq" ]; then
+		continue
+	fi
+
+	echo "$chip_name $hwirq"
+done)
+
+dup_hwirq_list=$(echo "$hwirq_list" | sort | uniq -cd)
+
+if [ -n "$dup_hwirq_list" ]; then
+	echo "ERROR: Found duplicate hwirq"
+	echo "$dup_hwirq_list"
+	exit 1
+fi
+
+exit 0
--
2.34.1
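
For reference, the per-file collection step in irq-check.sh (cat, grep
and awk run twice per file) could also be done in a single awk pass. The
following is only a sketch under that assumption: parse_irq_entry is a
hypothetical helper, not part of the posted patch, and the sample input
is abridged from the irq 163 dump quoted earlier in this thread.

```shell
#!/bin/bash
# Hypothetical helper: read one debugfs irq entry on stdin and print
# "<chip> <hwirq>" using the first hwirq: and chip: lines found,
# which mirrors the "grep -m 1" behaviour in the patch.
parse_irq_entry() {
	awk '
		$1 == "hwirq:" && hwirq == "" { hwirq = $2 }
		$1 == "chip:"  && chip  == "" { chip  = $2 }
		END { if (chip != "" && hwirq != "") print chip, hwirq }
	'
}

# Abridged sample entry; prints one "<chip> <hwirq>" line.
parse_irq_entry <<'EOF'
handler:  handle_fasteoi_irq
domain:  irqchip@0x0000100022040000-3
 hwirq:   0xc8000000
 chip:    ITS-MSI
  flags:   0x20
EOF
```

An entry without a chip: or hwirq: line prints nothing, which matches
the "continue" branch in the posted script.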





