Re: [linus:master] [mm] c0bff412e6: stress-ng.clone.ops_per_sec -2.9% regression

On 14.08.24 06:10, Mateusz Guzik wrote:
> On Wed, Aug 14, 2024 at 5:02 AM Yin Fengwei <fengwei.yin@xxxxxxxxx> wrote:
>>
>> On 8/13/24 03:14, Mateusz Guzik wrote:
>>> would you mind benchmarking the change which merely force-inlines _compound_head?
>>>
>>> https://lore.kernel.org/linux-mm/66c4fcc5-47f6-438c-a73a-3af6e19c3200@xxxxxxxxxx/
>>
>> This change also resolves the regression:
>
> Great, thanks.
>
> David, I guess this means it would be fine to inline the entire thing
> at least from this bench standpoint. Given that this is your idea I
> guess you should do the needful(tm)? :)

Testing

diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 5769fe6e4950..25e25b34f4a0 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -235,7 +235,7 @@ static __always_inline int page_is_fake_head(const struct page *page)
        return page_fixed_fake_head(page) != page;
 }
-static inline unsigned long _compound_head(const struct page *page)
+static __always_inline unsigned long _compound_head(const struct page *page)
 {
        unsigned long head = READ_ONCE(page->compound_head);
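
For readers less familiar with the distinction the diff relies on, here is a rough userspace sketch, not kernel code. It assumes GCC or Clang on an LP64 Linux target; `demo_always_inline`, `struct demo_page` and the `demo_*` functions are made-up names, and the body is a simplified model of _compound_head() that leaves out READ_ONCE() and the fake-head handling.

```c
#include <assert.h>

/*
 * Assumption: GCC or Clang. "demo_always_inline" is a made-up name for this
 * sketch; it uses the same compiler attribute that the kernel's
 * __always_inline is built on.
 */
#define demo_always_inline inline __attribute__((__always_inline__))

/* Hypothetical stand-in for struct page: only the field the diff reads. */
struct demo_page {
	unsigned long compound_head;
};

/* "static inline" is only a hint: the compiler may emit an out-of-line copy. */
static inline unsigned long demo_head_hint(const struct demo_page *page)
{
	unsigned long head = page->compound_head;

	/* Bit 0 set marks a tail page; the other bits hold the head address. */
	if (head & 1)
		return head - 1;
	return (unsigned long)page;
}

/* Same body, but the attribute asks for inlining into every caller. */
static demo_always_inline unsigned long demo_head_forced(const struct demo_page *page)
{
	unsigned long head = page->compound_head;

	if (head & 1)
		return head - 1;
	return (unsigned long)page;
}

/* Returns 1 when both variants agree for a head page and a tail page. */
static int demo_heads_agree(void)
{
	struct demo_page pages[2] = { { 0 }, { 0 } };
	unsigned long head_addr = (unsigned long)&pages[0];

	/* The tagging trick needs bit 0 of the head address to be free. */
	assert((head_addr & 1UL) == 0);

	/* Make pages[1] a tail page of pages[0]. */
	pages[1].compound_head = head_addr | 1UL;

	return demo_head_hint(&pages[0]) == head_addr &&
	       demo_head_forced(&pages[0]) == head_addr &&
	       demo_head_hint(&pages[1]) == head_addr &&
	       demo_head_forced(&pages[1]) == head_addr;
}
```

Both variants compute the same value; the attribute only changes whether the compiler is free to keep a separate out-of-line function. That is consistent with the out-of-line _compound_head symbol showing up in one of the two builds compared in this mail.
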

With a kernel config derived from Fedora's config-6.8.9-100.fc38.x86_64
(for convenience) and

CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP=y

the size changes are:

add/remove: 15/14 grow/shrink: 79/87 up/down: 12836/-13917 (-1081)
Function                                     old     new   delta
change_pte_range                               -    2308   +2308
iommu_put_dma_cookie                         454    1276    +822
get_hwpoison_page                           2007    2580    +573
end_bbio_data_read                          1171    1626    +455
end_bbio_meta_read                           492     934    +442
ext4_finish_bio                              773    1208    +435
fq_ring_free_locked                          128     541    +413
end_bbio_meta_write                          493     872    +379
gup_fast_fallback                           4207    4568    +361
v1_free_pgtable                              166     519    +353
iommu_v1_map_pages                          2747    3098    +351
end_bbio_data_write                          609     960    +351
fsverity_verify_bio                          334     656    +322
follow_page_mask                            3399    3719    +320
__read_end_io                                316     635    +319
btrfs_end_super_write                        494     789    +295
iommu_alloc_pages_node.constprop             286     572    +286
free_buffers.part                              -     271    +271
gup_must_unshare                               -     268    +268
smaps_pte_range                             1285    1513    +228
pagemap_pmd_range                           2189    2393    +204
iommu_alloc_pages_node                         -     193    +193
smaps_hugetlb_range                          705     897    +192
follow_page_pte                             1584    1758    +174
__migrate_device_pages                      2435    2595    +160
unpin_user_pages_dirty_lock                  205     362    +157
_compound_head                                 -     150    +150
unpin_user_pages                             143     282    +139
put_ref_page.part                              -     126    +126
iomap_finish_ioend                           866     972    +106
iomap_read_end_io                            673     763     +90
end_bbio_meta_read.cold                       42     131     +89
btrfs_do_readpage                           1759    1845     +86
extent_write_cache_pages                    2133    2212     +79
end_bbio_data_write.cold                      32     108     +76
end_bbio_meta_write.cold                      40     108     +68
__read_end_io.cold                            25      91     +66
btrfs_end_super_write.cold                    25      89     +64
ext4_finish_bio.cold                         118     178     +60
fsverity_verify_bio.cold                      25      84     +59
block_write_begin                            217     274     +57
end_bbio_data_read.cold                      378     426     +48
__pfx__compound_head                           -      48     +48
copy_hugetlb_page_range                     3050    3097     +47
lruvec_stat_mod_folio.constprop              585     630     +45
iomap_finish_ioend.cold                      163     202     +39
md_bitmap_file_unmap                         150     187     +37
free_pgd_range                              1949    1985     +36
prep_move_freepages_block                    319     349     +30
iommu_alloc_pages_node.cold                    -      25     +25
iomap_read_end_io.cold                        65      89     +24
zap_huge_pmd                                 874     897     +23
cont_write_begin.cold                        108     130     +22
skb_splice_from_iter                         822     843     +21
set_pmd_migration_entry                     1037    1058     +21
zerocopy_fill_skb_from_iter                 1321    1340     +19
pagemap_scan_pmd_entry                      3261    3279     +18
try_grab_folio_fast                          452     469     +17
change_huge_pmd                             1174    1191     +17
folio_put                                     48      64     +16
__pfx_set_p4d                                  -      16     +16
__pfx_put_ref_page.part                        -      16     +16
__pfx_lruvec_stat_mod_folio.constprop        208     224     +16
__pfx_iommu_alloc_pages_node.constprop        16      32     +16
__pfx_iommu_alloc_pages_node                   -      16     +16
__pfx_gup_must_unshare                         -      16     +16
__pfx_free_buffers.part                        -      16     +16
__pfx_folio_put                               48      64     +16
__pfx_change_pte_range                         -      16     +16
__pfx___pte                                   32      48     +16
offline_pages                               1962    1975     +13
memfd_pin_folios                            1284    1297     +13
uprobe_write_opcode                         2062    2073     +11
set_p4d                                        -      11     +11
__pte                                         22      33     +11
copy_page_from_iter_atomic                  1714    1724     +10
__migrate_device_pages.cold                   60      70     +10
try_to_unmap_one                            3355    3364      +9
try_to_migrate_one                          3310    3319      +9
stable_page_flags                           1034    1043      +9
io_sqe_buffer_register                      1404    1413      +9
dio_zero_block                               644     652      +8
add_ra_bio_pages.constprop.isra             1542    1550      +8
__add_to_kill                                969     977      +8
btrfs_writepage_fixup_worker                1199    1206      +7
write_protect_page                          1186    1192      +6
iommu_v2_map_pages.cold                      145     151      +6
gup_fast_fallback.cold                       112     117      +5
try_to_merge_one_page                       1857    1860      +3
__apply_to_page_range                       2235    2238      +3
wbc_account_cgroup_owner                     217     219      +2
change_protection.cold                       105     107      +2
can_change_pte_writable                      354     356      +2
vmf_insert_pfn_pud                           699     700      +1
split_huge_page_to_list_to_order.cold        152     151      -1
pte_pfn                                       40      39      -1
move_pages                                  5270    5269      -1
isolate_single_pageblock                    1056    1055      -1
__apply_to_page_range.cold                    92      91      -1
unmap_page_range.cold                         88      86      -2
do_huge_pmd_numa_page                       1175    1173      -2
free_pgd_range.cold                          162     159      -3
copy_page_to_iter                            329     326      -3
copy_page_range.cold                         149     146      -3
copy_page_from_iter                          307     304      -3
can_finish_ordered_extent                    551     548      -3
__replace_page                              1133    1130      -3
__reset_isolation_pfn                        645     641      -4
dio_send_cur_page                           1113    1108      -5
__access_remote_vm                          1010    1005      -5
pagemap_hugetlb_category                     468     459      -9
extent_write_locked_range                   1148    1139      -9
unuse_pte_range                             1834    1821     -13
do_migrate_range                            1935    1922     -13
__get_user_pages                            1952    1938     -14
migrate_vma_collect_pmd                     2817    2802     -15
copy_page_to_iter_nofault                   2373    2358     -15
hugetlb_fault                               4054    4038     -16
__pfx_shake_page                              16       -     -16
__pfx_put_page                                16       -     -16
__pfx_pfn_swap_entry_to_page                  32      16     -16
__pfx_gup_must_unshare.part                   16       -     -16
__pfx_gup_folio_next                          16       -     -16
__pfx_free_buffers                            16       -     -16
__pfx___get_unpoison_page                     16       -     -16
btrfs_cleanup_ordered_extents                622     604     -18
read_rdev                                    694     673     -21
isolate_migratepages_block.cold              222     197     -25
hugetlb_mfill_atomic_pte                    1869    1844     -25
folio_pte_batch.constprop                   1020     995     -25
hugetlb_reserve_pages                       1468    1441     -27
__alloc_fresh_hugetlb_folio                  676     649     -27
intel_pasid_alloc_table.cold                  83      52     -31
__pfx_iommu_put_pages_list                    48      16     -32
__pfx_PageHuge                                32       -     -32
__blockdev_direct_IO.cold                    952     920     -32
io_ctl_prepare_pages                         832     794     -38
__handle_mm_fault                           4237    4195     -42
finish_fault                                1007     962     -45
__pfx_pfn_swap_entry_folio                    64      16     -48
vm_normal_folio_pmd                           84      34     -50
vm_normal_folio                               84      34     -50
set_migratetype_isolate                     1429    1375     -54
do_set_pmd                                   618     561     -57
can_change_pmd_writable                      293     229     -64
__unmap_hugepage_range                      2389    2325     -64
do_fault                                    1187    1121     -66
fault_dirty_shared_page                      425     358     -67
madvise_free_huge_pmd                        863     792     -71
insert_page_into_pte_locked.isra             502     429     -73
restore_exclusive_pte                        539     463     -76
isolate_migratepages_block                  5436    5355     -81
__do_fault                                   366     276     -90
set_pte_range                                593     502     -91
follow_devmap_pmd                            559     468     -91
__pfx_bio_first_folio                        144      48     -96
shake_page                                   105       -    -105
hugetlb_change_protection                   2314    2204    -110
hugetlb_wp                                  2134    2017    -117
__blockdev_direct_IO                        5063    4946    -117
skb_tx_error                                 272     149    -123
put_page                                     123       -    -123
gup_must_unshare.part                        135       -    -135
PageHuge                                     136       -    -136
ksm_scan_thread                             9172    9032    -140
intel_pasid_alloc_table                      596     447    -149
copy_huge_pmd                               1539    1385    -154
skb_split                                   1534    1376    -158
split_huge_pmd_locked                       4024    3865    -159
skb_append_pagefrags                         663     504    -159
memory_failure                              2784    2624    -160
unpoison_memory                             1328    1167    -161
cont_write_begin                             959     793    -166
pfn_swap_entry_to_page                       250      82    -168
skb_pp_cow_data                             1539    1367    -172
gup_folio_next                               180       -    -180
intel_pasid_get_entry.isra                   607     425    -182
v2_alloc_pgtable                             309     126    -183
do_huge_pmd_wp_page                         1173     988    -185
bio_first_folio.cold                         315     105    -210
unmap_page_range                            6091    5873    -218
split_huge_page_to_list_to_order            4141    3905    -236
move_pages_huge_pmd                         2053    1813    -240
free_buffers                                 286       -    -286
iommu_v2_map_pages                          1722    1428    -294
soft_offline_page                           2149    1843    -306
do_wp_page                                  3340    2993    -347
do_swap_page                                4619    4265    -354
md_import_device                            1002     635    -367
copy_page_range                             7436    7040    -396
__get_unpoison_page                          415       -    -415
pfn_swap_entry_folio                         596     149    -447
iommu_put_pages_list                        1071     344    -727
bio_first_folio                             2322     774   -1548
change_protection                           5008    2790   -2218
Total: Before=32786363, After=32785282, chg -0.00%
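
As a quick sanity check of the summary lines, the net figure and the percentage can be recomputed from the reported totals. This is only a small sketch; the struct and function names are made up for illustration, and the four numbers are copied from the report above.

```c
#include <assert.h>
#include <stdio.h>

/* Figures copied from the summary lines of the report above. */
struct demo_bloat_summary {
	long up;     /* bytes added by grown and new symbols */
	long down;   /* bytes removed by shrunk and dropped symbols (negative) */
	long before; /* total size before the change */
	long after;  /* total size after the change */
};

static const struct demo_bloat_summary demo_reported = {
	.up = 12836, .down = -13917, .before = 32786363, .after = 32785282,
};

/* Net change as the sum of the "up" and "down" columns. */
static long demo_net_delta(const struct demo_bloat_summary *s)
{
	return s->up + s->down;
}

/* Relative change of the total, in percent. */
static double demo_change_percent(const struct demo_bloat_summary *s)
{
	return 100.0 * (double)(s->after - s->before) / (double)s->before;
}

/* Prints the recomputed figures after checking that both views agree. */
static void demo_print_summary(const struct demo_bloat_summary *s)
{
	assert(demo_net_delta(s) == s->after - s->before);
	printf("net %ld bytes, chg %.4f%%\n",
	       demo_net_delta(s), demo_change_percent(s));
}
```

Calling demo_print_summary(&demo_reported) gives a net change of -1081 bytes, roughly -0.0033% of the total, which the report rounds to -0.00%.
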


--
Cheers,

David / dhildenb




