Hi,
在 2024/03/27 3:49, Florian-Ewald Müller 写道:
Hi Kuai,
here (attached) the promised simple patch.
You can send this patch, LGTM.
Thanks,
Kuai
Best,
Florian
On Tue, Mar 26, 2024 at 4:51 PM Florian-Ewald Müller
<florian-ewald.mueller@xxxxxxxxx> wrote:
Hello Kuai,
Thank you for your comments and questions.
About our tests:
- we used many (>= 100) logical volumes (each 100 GiB big) on top of 3
x md-raid5s (making up a raid50) on 8 SSDs each.
- further we started a fio instance doing random rd/wr (also of random
sizes) on each of those volumes,
And, yes indeed, as you suggested, we observed some performance change
already after adding (first) wq_has_sleeper().
The fio IOPS results improved by around 3-4% - but, in my experience,
every fluctuation <= 3% can also have other reasons (e.g. different
timings, temperature on SSDs, etc.).
Only afterwards, I've also changed the wait queue construct with the
swait variant (sufficient here), as its wake up path is simpler,
faster and with a smaller stack footprint.
Also used the (now since some years available) IDLE state for the
waiting thread to eliminate the (not anymore necessary) checking and
flushing of signals.
This second round of changes, although theoretically an improvement,
didn't bring any measurable performance increase, though.
This was more or less expected.
If you think, the simple adding of the wq_has_sleeper() is more
justifiable, please apply only this change.
Later today, I'll also send you a patch containing only this simple change.
Best regards,
Florian
On Tue, Mar 26, 2024 at 3:09 PM Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
Hi,
在 2024/03/26 13:22, Jinpu Wang 写道:
Hi Song, hi Kuai,
ping, Any comments?
Thx!
On Thu, Mar 7, 2024 at 1:08 PM Jack Wang <jinpu.wang@xxxxxxxxx> wrote:
From: Florian-Ewald Mueller <florian-ewald.mueller@xxxxxxxxx>
Replace md_thread's wait_event()/wake_up() related calls with their
simple swait~ variants to improve performance as well as memory and
stack footprint.
Use the IDLE state for the worker thread put to sleep instead of
misusing the INTERRUPTIBLE state combined with flushing signals
just for not contributing to system's cpu load-average stats.
Also check for sleeping thread before attempting its wake_up in
md_wakeup_thread() for avoiding unnecessary spinlock contention.
I think it'll be better to split this into a seperate patch.
And can you check if we just add wq_has_sleeper(), will there be
performance improvement?
With this patch (backported) on a kernel 6.1, the IOPS improved
by around 4% with raid1 and/or raid5, in high IO load scenarios.
Can you be more specifical about your test? because from what I know,
IO fast path doesn't involved with daemon thread, and I don't understand
yet where the 4% improvement is from.
Thanks,
Kuai