Hi, On Mon, Mar 25, 2024 at 2:48 AM Bitao Hu <yaoma@xxxxxxxxxxxxxxxxx> wrote: > > Hi, Thomas > > On 2024/3/24 04:43, Thomas Gleixner wrote: > > On Wed, Mar 06 2024 at 20:52, Bitao Hu wrote: > >> + if (__this_cpu_read(snapshot_taken)) { > >> + for_each_active_irq(i) { > >> + count = kstat_get_irq_since_snapshot(i); > >> + tabulate_irq_count(irq_counts_sorted, i, count, NUM_HARDIRQ_REPORT); > >> + } > >> + > >> + /* > >> + * We do not want the "watchdog: " prefix on every line, > >> + * hence we use "printk" instead of "pr_crit". > >> + */ > > > > You are not providing any justification why the prefix is not > > wanted. Just saying 'We do not want' does not cut it and who is 'We'. I > > certainly not. > > > > I really disagree because the prefixes are very useful for searching log > > files. So not having it makes it harder to filter out for no reason. > > > > > Regarding the use of printk() instead of pr_crit(), I have had a > discussion with Liu Song and Douglas in PATCHv1: > https://lore.kernel.org/all/CAD=FV=WEEQeKX=ec3Gr-8CKs2K0MaWN3V0-0yOsuret0qcB_AA@xxxxxxxxxxxxxx/ > > Please allow me to elaborate on my reasoning. The purpose of the > report_cpu_status() function I implemented is similar to that of > print_modules(), show_regs(), and dump_stack(). These functions are > designed to assist in analyzing the causes of a soft lockup, rather > than to report that a soft lockup has occurred. Therefore, I think > that adding the "watchdog: " prefix to every line is redundant and > not concise. Besides, the existing pr_emerg() in the watchdog.c file > is already sufficient for searching useful information in the logs. > The information I added, along with the call tree and other data, is > located near the line with the "watchdog: " prefix. > > Are the two reasons I've provided reasonable? FWIW I don't feel super strongly about this, but I'm leaning towards agreeing with Bitao. The sample output from the commit message looks like this: [ 638.870231] watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [swapper/9:0] [ 638.870825] CPU#9 Utilization every 4s during lockup: [ 638.871194] #1: 0% system, 0% softirq, 100% hardirq, 0% idle [ 638.871652] #2: 0% system, 0% softirq, 100% hardirq, 0% idle [ 638.872107] #3: 0% system, 0% softirq, 100% hardirq, 0% idle [ 638.872563] #4: 0% system, 0% softirq, 100% hardirq, 0% idle [ 638.873018] #5: 0% system, 0% softirq, 100% hardirq, 0% idle [ 638.873494] CPU#9 Detect HardIRQ Time exceeds 50%. Most frequent HardIRQs: [ 638.873994] #1: 330945 irq#7 [ 638.874236] #2: 31 irq#82 [ 638.874493] #3: 10 irq#10 [ 638.874744] #4: 2 irq#89 [ 638.874992] #5: 1 irq#102 ...and in my mind the "watchdog: BUG: soft lockup - CPU#9 stuck for 26s! [swapper/9:0]" line is enough to grep through the dmesg. Having all the following lines start with "watchdog:" feels like overkill to me, but if you feel strongly that they should then it wouldn't bother me too much for them all to have the "watchdog:" prefix. Could you clarify how strongly you feel about this and whether Bitao should spin a v13? I believe that this is the only point of contention on the patch series right now and otherwise it could be ready to land. I know in the past Petr has wanted ample time to comment though perhaps the fact that it's been ~1 month is enough. Petr: do you have anything that needs saying before this patch series lands? Thanks! -Doug