On Wed, Nov 20, 2024 at 10:03:54AM +0100, Peter Zijlstra wrote:
> On Tue, Nov 19, 2024 at 04:30:02PM -0800, Chenbo Lu wrote:
> > Hello,
> >
> > I am experiencing a significant performance degradation after
> > upgrading my kernel from version 6.6 to 6.8 and would appreciate any
> > insights or suggestions.
> >
> > I am running a high-load simulation system that spawns more than 1000
> > threads and the overall CPU usage is 30%+ . Most of the threads are
> > using real-time
> > scheduling (SCHED_RR), and the threads of a model are using
> > SCHED_DEADLINE. After upgrading the kernel, I noticed that the
> > execution time of my model has increased from 4.5ms to 6ms.
> >
> > What I Have Done So Far:
> > 1. I found this [bug
> > report](https://bugzilla.kernel.org/show_bug.cgi?id=219366#c7) and
> > reverted the commit efa7df3e3bb5da8e6abbe37727417f32a37fba47 mentioned
> > in the post. Unfortunately, this did not resolve the issue.
> > 2. I performed a git bisect and found that after these two commits
> > related to scheduling (RT and deadline) were merged, the problem
> > happened. They are 612f769edd06a6e42f7cd72425488e68ddaeef0a,
> > 5fe7765997b139e2d922b58359dea181efe618f9
>
> And yet you failed to Cc Valentin, the author of said commits :/
>
> > After reverting these two commits, the model execution time improved
> > to around 5 ms.
> > 3. I revert two more commits, and the execution time is back to 4.7ms:
> > 63ba8422f876e32ee564ea95da9a7313b13ff0a1,
> > efa7df3e3bb5da8e6abbe37727417f32a37fba47
> >
> > My questions are:
> > 1.Has anyone else experienced similar performance degradation after
> > upgrading to kernel 6.8?
>
> This is 4 kernel releases back, I my memory isn't that long.
>
> > 2.Can anyone explain why these two commits are causing the problem? I
> > am not very familiar with the kernel code and would appreciate any
> > insights.
>
> There might be a race window between setting the tro and sending the
> IPI, such that previously the extra IPIs would sooner find the newly
> pushable task.
>
> Valentin, would it make sense to set tro before enqueueing the pushable,
> instead of after it?

s/tro/rto/ clearly I'm consistently not capable of typing that :-)