Hello,

On Tue, Jul 11, 2023 at 04:06:22PM +0200, Geert Uytterhoeven wrote:
> On Tue, Jul 11, 2023 at 3:55 PM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
> >
> > Hi Tejun,
> >
> > On Fri, May 12, 2023 at 9:54 PM Tejun Heo <tj@xxxxxxxxxx> wrote:
> > > Workqueue now automatically marks per-cpu work items that hog CPU for too
> > > long as CPU_INTENSIVE, which excludes them from concurrency management and
> > > prevents stalling other concurrency-managed work items. If a work function
> > > keeps running over the threshold, it likely needs to be switched to use an
> > > unbound workqueue.
> > >
> > > This patch adds a debug mechanism which tracks the work functions which
> > > trigger the automatic CPU_INTENSIVE mechanism and reports them using
> > > pr_warn() with exponential backoff.
> > >
> > > v2: Drop bouncing through kthread_worker for printing messages. It was to
> > >     avoid introducing circular locking dependency but wasn't effective as
> > >     it still had pool lock -> wci_lock -> printk -> pool lock loop. Let's
> > >     just print directly using printk_deferred().
> > >
> > > Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
> > > Suggested-by: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
> >
> > Thanks for your patch, which is now commit 6363845005202148
> > ("workqueue: Report work funcs that trigger automatic CPU_INTENSIVE
> > mechanism") in v6.5-rc1.
> >
> > I guess you are interested to know where this triggers.
> > I enabled CONFIG_WQ_CPU_INTENSIVE_REPORT=y, and tested
> > the result on various machines...
>
> > OrangeCrab/Linux-on-LiteX-VexRiscV with ht16k33 14-seg display and ssd130xdrmfb:
> >
> > workqueue: check_lifetime hogged CPU for >10000us 4 times, consider
> > switching to WQ_UNBOUND
> > workqueue: drm_fb_helper_damage_work hogged CPU for >10000us 1024
> > times, consider switching to WQ_UNBOUND
> > workqueue: fb_flashcursor hogged CPU for >10000us 128 times,
> > consider switching to WQ_UNBOUND
> > workqueue: ht16k33_seg14_update hogged CPU for >10000us 128 times,
> > consider switching to WQ_UNBOUND
> > workqueue: mmc_rescan hogged CPU for >10000us 128 times, consider
> > switching to WQ_UNBOUND
>
> Got one more after a while:
>
> workqueue: neigh_managed_work hogged CPU for >10000us 4 times,
> consider switching to WQ_UNBOUND

I wonder whether the right thing to do here is somehow scaling the
threshold according to the relative processing power. It's difficult to
come up with a threshold which works well across both the latest &
fastest CPUs and really tiny ones. I'll think about it some more, but if
you have some ideas, please feel free to suggest.
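For illustration only, here is a rough sketch of what such scaling could
look like, assuming BogoMIPS derived from loops_per_jiffy is a good-enough
proxy for relative processing power. The helper name
wq_scale_cpu_intensive_thresh() and the 4000 BogoMIPS reference point are
made up for this sketch, not existing code:

/*
 * Illustrative sketch only, not an in-tree implementation: derive
 * BogoMIPS from loops_per_jiffy and stretch the 10ms detection
 * threshold on slow CPUs so that ordinary work items on tiny cores
 * don't trip the CPU_INTENSIVE report.
 */
#include <linux/delay.h>	/* loops_per_jiffy */
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/time64.h>	/* USEC_PER_MSEC */

/* stands in for the detection threshold used by the report */
static unsigned long wq_cpu_intensive_thresh_us = 10 * USEC_PER_MSEC;

static void __init wq_scale_cpu_intensive_thresh(void)
{
	/* BogoMIPS ~= loops_per_jiffy * HZ / 500000; clamp to avoid div by 0 */
	unsigned long bogo = max_t(unsigned long,
				   loops_per_jiffy / 500000 * HZ, 1);

	/*
	 * Treat ~4000 BogoMIPS as the class of machine the 10ms default
	 * was tuned on and scale the threshold up proportionally on
	 * anything slower.
	 */
	if (bogo < 4000)
		wq_cpu_intensive_thresh_us =
			wq_cpu_intensive_thresh_us * 4000 / bogo;
}

The obvious caveat is that BogoMIPS is a very crude measure on modern
hardware, so anything along these lines would only be a heuristic for the
reporting threshold, not a precise calibration.

Thanks.

--
tejun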