Re: [PATCH v4 00/10] sched/psi: some optimizations and extensions

Hello,

Could this series be merged into linux-next?

Thanks.


On 2022/8/26 00:41, Chengming Zhou wrote:
> Hi all,
> 
> This patch series contains some optimizations and extensions for PSI.
> 
> Patch 1/10 fixes the periodic aggregation shut-off problem introduced by the
> earlier commit 4117cebf1a9f ("psi: Optimize task switch inside shared cgroups").
> 
> Patches 2-4 are miscellaneous optimizations, so they are placed at the front
> of the series.
> 
> Patch 5/10 optimizes task switches inside shared cgroups when the in_memstall
> status of the prev task and the next task differs.
> 
> Patch 6/10 removes NR_ONCPU task accounting to save 4 bytes in the first
> cacheline. Patch 7/10 uses that space to introduce a new PSI resource,
> PSI_IRQ, which tracks IRQ/SOFTIRQ pressure stall information.
> 
> Patches 8-9 cache the parent psi_group in struct psi_group to speed up the
> hot iteration path.
> 
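For readers skimming the thread, here is a minimal Python sketch of the idea behind patches 8-9: each group keeps a direct reference to its parent, so the hot path reaches every ancestor by following one pointer per level. The names PsiGroup and iterate_groups are hypothetical stand-ins for illustration and are not taken from the kernel patches.

```python
from typing import Iterator, Optional


class PsiGroup:
    """Hypothetical stand-in for struct psi_group with a cached parent."""

    def __init__(self, name: str, parent: Optional["PsiGroup"] = None):
        self.name = name
        # Cached once at creation, so the walk below never has to
        # resolve the parent through another object.
        self.parent = parent


def iterate_groups(group: Optional[PsiGroup]) -> Iterator[PsiGroup]:
    """Yield the group and each of its ancestors, leaf first, root last."""
    while group is not None:
        yield group
        group = group.parent


# Example hierarchy mirroring the cgroup path used in the benchmark.
root = PsiGroup("root")
user = PsiGroup("user.slice", parent=root)
session = PsiGroup("session-4.scope", parent=user)
print([g.name for g in iterate_groups(session)])
```

The walk costs one attribute load per level, which is the property the cover letter relies on for the hot path.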
> Patch 10/10 introduces a per-cgroup interface, "cgroup.pressure", to disable
> or re-enable PSI at the cgroup level. It hides and unhides the pressure
> files per Tejun's suggestion[1], which depends on his work[2].
> 
> [1] https://lore.kernel.org/all/YvqjhqJQi2J8RG3X@xxxxxxxxxxxxxxx/
> [2] https://lore.kernel.org/all/20220820000550.367085-1-tj@xxxxxxxxxx/
> 
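A rough sketch of how userspace might toggle the "cgroup.pressure" interface from patch 10/10. It assumes the file accepts "0" to disable and "1" to re-enable accounting, which the cover letter itself does not spell out, and the helper names are hypothetical. The demo runs against a temporary directory standing in for a cgroup; on a real system the directory would sit under the cgroup v2 mount and writing would normally require privileges.

```python
import tempfile
from pathlib import Path


def set_cgroup_psi(cgroup_dir: Path, enabled: bool) -> None:
    """Write "1" or "0" to <cgroup_dir>/cgroup.pressure (assumed values)."""
    value = "1\n" if enabled else "0\n"
    (cgroup_dir / "cgroup.pressure").write_text(value)


def cgroup_psi_enabled(cgroup_dir: Path) -> bool:
    """Read the current setting back from cgroup.pressure."""
    return (cgroup_dir / "cgroup.pressure").read_text().strip() == "1"


# Demo against a temporary directory instead of a real cgroup.
with tempfile.TemporaryDirectory() as tmp:
    fake_cgroup = Path(tmp)
    set_cgroup_psi(fake_cgroup, False)
    print(cgroup_psi_enabled(fake_cgroup))
    set_cgroup_psi(fake_cgroup, True)
    print(cgroup_psi_enabled(fake_cgroup))
```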
> Performance test using mmtests/config-scheduler-perfpipe in
> /user.slice/user-0.slice/session-4.scope:
> 
>                                  next                patched       patched/only-leaf
> Min       Time        8.82 (   0.00%)        8.49 (   3.74%)        8.00 (   9.32%)
> 1st-qrtle Time        8.90 (   0.00%)        8.58 (   3.63%)        8.05 (   9.58%)
> 2nd-qrtle Time        8.94 (   0.00%)        8.61 (   3.65%)        8.09 (   9.50%)
> 3rd-qrtle Time        8.99 (   0.00%)        8.65 (   3.75%)        8.15 (   9.35%)
> Max-1     Time        8.82 (   0.00%)        8.49 (   3.74%)        8.00 (   9.32%)
> Max-5     Time        8.82 (   0.00%)        8.49 (   3.74%)        8.00 (   9.32%)
> Max-10    Time        8.84 (   0.00%)        8.55 (   3.20%)        8.04 (   9.05%)
> Max-90    Time        9.04 (   0.00%)        8.67 (   4.10%)        8.18 (   9.51%)
> Max-95    Time        9.04 (   0.00%)        8.68 (   4.03%)        8.20 (   9.26%)
> Max-99    Time        9.07 (   0.00%)        8.73 (   3.82%)        8.25 (   9.11%)
> Max       Time        9.12 (   0.00%)        8.89 (   2.54%)        8.27 (   9.29%)
> Amean     Time        8.95 (   0.00%)        8.62 *   3.67%*        8.11 *   9.43%*
> 
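For reference, the percentages in the table can be read as relative time savings against the "next" column, assuming lower Time is better. A small sketch of that arithmetic follows; improvement_pct is a hypothetical helper, and because the table was presumably generated from unrounded timings, recomputing from the rounded values shown may differ in the last digit for some rows.

```python
def improvement_pct(baseline: float, patched: float) -> float:
    """Relative reduction in time versus the baseline, in percent."""
    return (baseline - patched) / baseline * 100.0


# Min row: next = 8.82, patched = 8.49
print(f"{improvement_pct(8.82, 8.49):.2f}%")
```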
> Big thanks to Johannes Weiner, Tejun Heo and Michal Koutný for your
> suggestions and review!
> 
> 
> Changes in v4:
>  - Collect Acked-by tags from Johannes Weiner.
>  - Add many clear comments and changelogs per Johannes Weiner.
>  - Replace for_each_psi_group() with clearer open-coded iteration.
>  - Change to use better names cgroup_pressure_show() and
>    cgroup_pressure_write().
>  - Change to use better name psi_cgroup_restart() and only
>    call it on enabling.
> 
> Changes in v3:
>  - Rebase on linux-next and reorder patches to put the misc optimization
>    patches at the front of this series.
>  - Drop patch "sched/psi: don't change task psi_flags when migrate CPU/group"
>    since it caused a small performance regression and is only a code
>    refactoring.
>  - Don't define PSI_IRQ and PSI_IRQ_FULL when !CONFIG_IRQ_TIME_ACCOUNTING,
>    in which case they are not used.
>  - Add patch 8/10 "sched/psi: consolidate cgroup_psi()", which makes
>    cgroup_psi() handle all cgroups including the root cgroup and makes
>    patch 9/10 simpler.
>  - Rename interface to "cgroup.pressure" and add some explanation
>    per Michal's suggestion.
>  - Hide and unhide pressure files when disabling/re-enabling cgroup PSI;
>    this depends on Tejun's work.
> 
> Changes in v2:
>  - Add Acked-by tags from Johannes Weiner. Thanks for review!
>  - Fix periodic aggregation wakeup for common ancestors in
>    psi_task_switch().
>  - Add patch 7/10 from Johannes Weiner, which removes NR_ONCPU
>    task accounting to save 4 bytes in the first cacheline.
>  - Remove "psi_irq=" kernel cmdline parameter in last version.
>  - Add per-cgroup interface "cgroup.psi" to disable/re-enable
>    PSI stats accounting in the cgroup level.
> 
> 
> Chengming Zhou (9):
>   sched/psi: fix periodic aggregation shut off
>   sched/psi: don't create cgroup PSI files when psi_disabled
>   sched/psi: save percpu memory when !psi_cgroups_enabled
>   sched/psi: move private helpers to sched/stats.h
>   sched/psi: optimize task switch inside shared cgroups again
>   sched/psi: add PSI_IRQ to track IRQ/SOFTIRQ pressure
>   sched/psi: consolidate cgroup_psi()
>   sched/psi: cache parent psi_group to speed up groups iterate
>   sched/psi: per-cgroup PSI accounting disable/re-enable interface
> 
> Johannes Weiner (1):
>   sched/psi: remove NR_ONCPU task accounting
> 
>  Documentation/admin-guide/cgroup-v2.rst |  23 ++
>  include/linux/cgroup-defs.h             |   3 +
>  include/linux/cgroup.h                  |   5 -
>  include/linux/psi.h                     |  12 +-
>  include/linux/psi_types.h               |  29 ++-
>  kernel/cgroup/cgroup.c                  | 106 ++++++++-
>  kernel/sched/core.c                     |   1 +
>  kernel/sched/psi.c                      | 280 +++++++++++++++++-------
>  kernel/sched/stats.h                    |   6 +
>  9 files changed, 362 insertions(+), 103 deletions(-)
> 


