On 2023/2/27 21:42, Ammar Faizi wrote:
On Mon, Feb 27, 2023 at 06:18:43PM +0800, Qu Wenruo wrote:
I'm not sure if pinning the wq is really the best way to solve your problem.
Yes, I understand you want to limit the CPU usage of btrfs workqueues, but
have you tried "thread_pool=" mount option?
That mount option should limit the max number of in-flight work items, and
thus at least limit the CPU usage.
I have tried the thread_pool=%u mount option previously, but I didn't
examine its effect closely. I'll play with this option more and see if
it can yield the desired behavior.
With the thread_pool mount option, the behavior change is much harder to
observe. A wq pinned to one or two CPUs is easy to spot in htop, while an
unbound wq, even with thread_pool, is not.
Thus it needs more systematic testing to find the difference.
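
As a rough starting point for that kind of testing, here is a minimal
Python sketch (not part of the original discussion) that counts, per CPU,
the worker threads whose name mentions btrfs. It reads /proc/<pid>/stat,
where field 39 is the CPU the task last ran on. Matching on the substring
"btrfs" in the thread name is an assumption about how the workers are
named on a given kernel, and the helper names are made up for illustration.

```python
import os
from collections import Counter


def parse_stat(line):
    """Return (comm, last_cpu, cpu_ticks) from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so locate it via the first '(' and the last ')'.
    start = line.index("(")
    end = line.rindex(")")
    comm = line[start + 1:end]
    rest = line[end + 1:].split()            # rest[0] is field 3 (state)
    utime, stime = int(rest[11]), int(rest[12])   # fields 14 and 15
    last_cpu = int(rest[36])                 # field 39 (processor)
    return comm, last_cpu, utime + stime


def btrfs_worker_cpus(proc="/proc"):
    """Count btrfs-related worker threads by the CPU they last ran on."""
    counts = Counter()
    if not os.path.isdir(proc):
        return counts
    for entry in os.listdir(proc):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc, entry, "stat")) as f:
                comm, cpu, _ = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # task exited or the line had an unexpected format
        if "btrfs" in comm:  # ASSUMPTION: worker names contain "btrfs"
            counts[cpu] += 1
    return counts


if __name__ == "__main__":
    for cpu, n in sorted(btrfs_worker_cpus().items()):
        print(f"cpu{cpu}: {n} btrfs worker(s)")
```

Sampling this repeatedly under load, with and without thread_pool, would
give a crude picture of how the workers spread across CPUs. It is only a
snapshot of where each thread last ran, not a measurement of CPU usage.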
For the wq CPU pinning part, I'm not sure if it's really needed, although
it's known CPU pinning can affect some performance characteristics.
What I like about CPU pinning is that we can dedicate CPUs for specific
workloads so it won't cause scheduling noise to the app we've dedicated
other CPUs for.
I'm not 100% sure if we're really any better than the scheduler
developers, as there are a lot more things to consider.
E.g. recent Intel CPUs have BIG and little cores, and the BIG
cores even have SMT support.
For current kernels, the scheduler would avoid putting workloads onto
threads sharing the same physical core.
Thus it can result in a seemingly weird priority like BIG core thread1 >
little core > BIG core thread2.
But that results in better overall performance.
So overall, unless necessary I'd avoid manual CPU pinning.
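
To see which logical CPUs share a physical core before pinning anything
by hand, a small Python sketch along these lines may help. It reads the
sysfs topology files (thread_siblings_list) and groups SMT siblings. The
function names are hypothetical, and the parser only handles the plain
"0-3,8" cpulist form.

```python
import glob


def parse_cpulist(text):
    """Parse a kernel cpulist string such as "0-3,8" into a sorted list."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def smt_sibling_groups(sysfs="/sys/devices/system/cpu"):
    """Return the distinct groups of logical CPUs sharing a physical core."""
    groups = set()
    pattern = f"{sysfs}/cpu[0-9]*/topology/thread_siblings_list"
    for path in glob.glob(pattern):
        try:
            with open(path) as f:
                groups.add(tuple(parse_cpulist(f.read())))
        except (OSError, ValueError):
            continue  # CPU offline or unexpected file content
    return sorted(groups)


if __name__ == "__main__":
    for group in smt_sibling_groups():
        kind = "SMT siblings" if len(group) > 1 else "single thread"
        print(f"{group}: {kind}")
```

Any group with more than one entry is a set of SMT siblings. Pinning two
busy workers to CPUs in the same group makes them compete for one core,
which is the situation the scheduler normally tries to avoid.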
Thanks,
Qu