On Fri, Sep 06, 2024 at 06:19:08AM +0800, Hillf Danton wrote:
> On Tue, 23 Jul 2024 14:14:34 -0300 Marcelo Tosatti <mtosatti@xxxxxxxxxx>
> > On Sat, Jun 22, 2024 at 12:58:08AM -0300, Leonardo Bras wrote:
> > > The problem:
> > > Some places in the kernel implement a parallel programming strategy
> > > consisting on local_locks() for most of the work, and some rare remote
> > > operations are scheduled on target cpu. This keeps cache bouncing low since
> > > cacheline tends to be mostly local, and avoids the cost of locks in non-RT
> > > kernels, even though the very few remote operations will be expensive due
> > > to scheduling overhead.
> > >
> > > On the other hand, for RT workloads this can represent a problem: getting
> > > an important workload scheduled out to deal with remote requests is
> > > sure to introduce unexpected deadline misses.
> >
> > Another hang with a busy polling workload (kernel update hangs on
> > grub2-probe):
> >
> > [342431.665417] INFO: task grub2-probe:24484 blocked for more than 622 seconds.
> > [342431.665458] Tainted: G W X ------- --- 5.14.0-438.el9s.x86_64+rt #1
> > [342431.665488] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [342431.665515] task:grub2-probe state:D stack:0 pid:24484 ppid:24455 flags:0x00004002
> > [342431.665523] Call Trace:
> > [342431.665525] <TASK>
> > [342431.665527] __schedule+0x22a/0x580
> > [342431.665537] schedule+0x30/0x80
> > [342431.665539] schedule_timeout+0x153/0x190
> > [342431.665543] ? preempt_schedule_thunk+0x16/0x30
> > [342431.665548] ? preempt_count_add+0x70/0xa0
> > [342431.665554] __wait_for_common+0x8b/0x1c0
> > [342431.665557] ? __pfx_schedule_timeout+0x10/0x10
> > [342431.665560] __flush_work.isra.0+0x15b/0x220
>
> The fresh new flush_percpu_work() is nop with CONFIG_PREEMPT_RT enabled, why
> are you testing it with 5.14.0-438.el9s.x86_64+rt instead of mainline? Or what
> are you testing?
>
> BTW the hang fails to show the unexpected deadline misses.

I think he is showing a client case in which my patchset would be helpful,
and would avoid those stalls with CONFIG_PREEMPT_RT=y.

>
> > [342431.665565] ? __pfx_wq_barrier_func+0x10/0x10
> > [342431.665570] __lru_add_drain_all+0x17d/0x220
> > [342431.665576] invalidate_bdev+0x28/0x40
> > [342431.665583] blkdev_common_ioctl+0x714/0xa30
> > [342431.665588] ? bucket_table_alloc.isra.0+0x1/0x150
> > [342431.665593] ? cp_new_stat+0xbb/0x180
> > [342431.665599] blkdev_ioctl+0x112/0x270
> > [342431.665603] ? security_file_ioctl+0x2f/0x50
> > [342431.665609] __x64_sys_ioctl+0x87/0xc0
>