Re: [PATCHv9 2/3] irq: use a struct for the kstat_irqs in the interrupt descriptor

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Hi,

On 2024/2/22 21:22, Thomas Gleixner wrote:
On Thu, Feb 22 2024 at 17:34, Bitao Hu wrote:

First of all the subsystem prefix is 'genirq:'. 'git log kernel/irq/'
gives you a pretty good hint. It's documented....

Secondly the subject line does not match what this patch is about. It's
not about using a struct, it's about providing a snapshot mechanism, no?

The current implementation uses an int for the kstat_irqs in the
interrupt descriptor.

However, we need to know the number of interrupts which happened
since softlockup detection took a snapshot in order to analyze
the problem caused by an interrupt storm.

Replacing an int with a struct and providing sensible interfaces
for the watchdog code can keep it self contained to the interrupt
core code.

So something like this makes a useful change log for this:

  Subject: genirq: Provide a snapshot mechanism for interrupt statistics

  The soft lockup detector lacks a mechanism to identify interrupt storms
  as root cause of a lockup. To enable this the detector needs a
  mechanism to snapshot the interrupt count statistics on a CPU when the
  detector observes a potential lockup scenario and compare that against
  the interrupt count when it warns about the lockup later on. The number
  of interrupts in that period give a hint whether the lockup might be
  caused by an interrupt storm.

  Instead of having extra storage in the lockup detector and accessing
  the internals of the interrupt descriptor directly, convert the per CPU
  irq_desc::kstat_irq member to a data structure which contains the
  counter plus a snapshot member and provide interfaces to take a
  snapshot of all interrupts on the current CPU and to retrieve the delta
  of a specific interrupt later on.

Thanks, the changelog you wrote very clearly articulates the purpose of
this patch.

Hmm?

Signed-off-by: Bitao Hu <yaoma@xxxxxxxxxxxxxxxxx>

Interesting. You fully authored the patch?

That's not how it works. You cannot take work from others and claim that
it is yours. The minimal courtesy is to add a 'Originally-by:' tag.

I'm very sorry, the majority of this patch is your work, I will add an
'Originally-by:' tag.

diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index 623b8136e9af..3ad40cf30c66 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -488,18 +488,15 @@ int show_interrupts(struct seq_file *p, void *v)
  	if (!desc || irq_settings_is_hidden(desc))
  		goto outsparse;
- if (desc->kstat_irqs) {
-		for_each_online_cpu(j)
-			any_count |= data_race(*per_cpu_ptr(desc->kstat_irqs, j));
-	}
+	if (desc->kstat_irqs)
+		any_count = data_race(desc->tot_count);

This is an unrelated change and needs to be split out into a separate
patch with a proper changelog which explains why this is equivalent.


Alright, I will remove this change witch is not related to the purpose
of this patch.

I guess that the purpose of suggesting this change in your V1 response
was to speedup the 'show_interrupts'. However, after reviewing the
usage of 'desc->tot_count' in 'unsigned int kstat_irqs(unsigned int irq)', I think the change might be as follows:

diff --git a/kernel/irq/proc.c b/kernel/irq/proc.c
index 623b8136e9af..53b8d6edd7ac 100644
--- a/kernel/irq/proc.c
+++ b/kernel/irq/proc.c
@@ -489,8 +489,13 @@ int show_interrupts(struct seq_file *p, void *v)
                goto outsparse;

        if (desc->kstat_irqs) {
-               for_each_online_cpu(j)
- any_count |= data_race(per_cpu(desc->kstat_irqs->cnt, j));
+               if (!irq_settings_is_per_cpu_devid(desc) &&
+                   !irq_settings_is_per_cpu(desc) &&
+                   !irq_is_nmi(desc))
+                       any_count = data_race(desc->tot_count);
+               else
+                       for_each_online_cpu(j)
+ any_count |= data_race(per_cpu(desc->kstat_irqs->cnt, j));
        }

        if ((!desc->action || irq_desc_is_chained(desc)) && !any_count)

Is my idea correct?

Best Regards,
	Bitao




[Index of Archives]     [LKML Archive]     [Linux ARM Kernel]     [Linux ARM]     [Git]     [Yosemite News]     [Linux SCSI]     [Linux Hams]

  Powered by Linux