Inlined functions in radeon_object.h

Hi Jerome,

What was the rationale for inlining the functions radeon_bo_reserve() and
radeon_bo_wait()? These functions are called many times (especially
radeon_bo_reserve(), which is called 47 times) and they aren't particularly
small (especially radeon_bo_wait()). I found that un-inlining radeon_bo_wait()
saves 187 bytes of code in radeon.o on x86-64:

add/remove: 1/0 grow/shrink: 0/3 up/down: 178/-365 (-187)
function                                     old     new   delta
radeon_bo_wait                                 -     178    +178
radeon_gem_wait_idle_ioctl                   257     157    -100
radeon_gem_busy_ioctl                        298     187    -111
radeon_gem_set_domain                        251      97    -154

And un-inlining radeon_bo_reserve() saves 2644 bytes of code on radeon.o
on x86-64 (2947 if you include the string section):

add/remove: 1/0 grow/shrink: 0/38 up/down: 100/-2744 (-2644)
function                                     old     new   delta
radeon_bo_reserve                              -     100    +100
rs600_gart_disable                           178     143     -35
radeon_gart_table_vram_free                  142      94     -48
r600_wb_enable                               657     609     -48
r600_ih_ring_fini                            164     116     -48
radeon_ring_fini                             189     139     -50
radeon_ib_pool_fini                          221     171     -50
r600_suspend                                 231     181     -50
rv770_fini                                   262     210     -52
r600_blit_fini                               146      94     -52
r600_wb_disable                              194     141     -53
radeon_gem_get_tiling_ioctl                  263     209     -54
radeon_bo_set_tiling_flags                   151      97     -54
rv770_suspend                                231     175     -56
radeon_suspend_kms                           540     481     -59
radeon_fb_find_or_create_single             1597    1538     -59
rv770_pcie_gart_disable                      958     898     -60
rv370_pcie_gart_disable                      947     887     -60
radeon_ring_init                             382     322     -60
r600_pcie_gart_disable                      1405    1345     -60
evergreen_pcie_gart_disable                 1020     960     -60
radeon_gem_object_unpin                      115      54     -61
radeon_ttm_fini                              283     220     -63
radeon_ttm_init                              997     929     -68
radeon_gem_object_pin                        158      90     -68
radeon_gart_table_vram_pin                   243     174     -69
rv770_startup                              11065   10995     -70
radeon_bo_list_reserve                       126      56     -70
r600_blit_init                               850     779     -71
radeonfb_destroy_pinned_object               178     106     -72
r600_startup                                3235    3163     -72
radeon_ib_pool_init                          546     470     -76
r600_irq_init                               2648    2572     -76
r100_wb_fini                                 281     197     -84
r100_wb_init                                 580     488     -92
radeon_test_moves                           1524    1429     -95
radeon_crtc_set_base                        1774    1668    -106
atombios_crtc_set_base                      4477    4248    -229
radeon_benchmark_move                       1208     974    -234

Would you take a patch un-inlining either or both of these functions?

For radeon_bo_reserve(), an alternative would be to remove the error
message. After all, we just decided that the same error message was
needless in radeon_bo_wait(). Maybe the same reasoning applies to
radeon_bo_reserve(), in which case the function would become a one-liner
which we can legitimately keep inlined. The binary size benefit is
slightly smaller (-2294 bytes on x86-64, versus -2644 for un-inlining),
but the code would be slightly faster (one function call saved). What do
you think?
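
To make the two options concrete, here is a rough user-space sketch. It is
not the actual radeon code: struct mock_bo, mock_ttm_reserve() and the text
of the error message are hypothetical stand-ins, and fprintf() merely takes
the place of the kernel's logging call.

```c
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-in for a buffer object; not the real radeon/TTM type. */
struct mock_bo {
	int reserved;
};

/* Hypothetical stand-in for the underlying reserve call.
 * Returns 0 on success, -EBUSY if the object is already reserved. */
static int mock_ttm_reserve(struct mock_bo *bo)
{
	if (bo->reserved)
		return -EBUSY;
	bo->reserved = 1;
	return 0;
}

/* Option 1: un-inlined, error message kept.
 * The check, the format string and the logging call exist once,
 * and every call site shrinks to a plain function call. */
__attribute__((noinline))
int bo_reserve_outofline(struct mock_bo *bo)
{
	int r = mock_ttm_reserve(bo);

	if (r != 0)
		fprintf(stderr, "%p reserve failed\n", (void *)bo);
	return r;
}

/* Option 2: error message dropped, so the wrapper is a one-liner
 * that costs almost nothing to keep inlined at every call site. */
static inline int bo_reserve_inline(struct mock_bo *bo)
{
	return mock_ttm_reserve(bo);
}
```

Both variants return the same value to the caller. The difference is only
where the error-reporting code lives: once in the out-of-line function, or
nowhere at all in the one-liner.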

-- 
Jean Delvare