Re: [PATCH v2] timers: Optimize usleep_range()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Jul 29, 2022 at 1:29 PM Christophe JAILLET
<christophe.jaillet@xxxxxxxxxx> wrote:
>
> Most of the time the 'min' and 'max' parameters of usleep_range() are
> constant. We can take advantage of it to pre-compute at compile time
> some values otherwise computer at run-time in usleep_range_state().
>
> Replace usleep_range_state() by a new __nsleep_range_delta_state() function
> that takes as parameters the pre-computed values.
>
> The main benefit is to save a few instructions, especially 2
> multiplications (x1000 when converting us to ns).
>
>
> Some hand simplified diff of the generated asm are given below. They were
> produced on a Intel(R) Core(TM) i7-3770, with gcc 11.2.0.
>
> drivers/clk/clk-si514.c (taken as an example)
> -----------------------
> In this driver we have:
>    usleep_range(10000, 12000);
>
> --- clk_before.asm      2022-07-29 21:49:05.702289425 +0200
> +++ clk_after.asm       2022-07-29 21:50:23.801548963 +0200
> @@ -972,8 +972,8 @@
>   ea0:  45 85 e4                test   %r12d,%r12d
>   ea3:  0f 88 f6 fc ff ff       js     b9f <si514_set_rate+0x9f>
>   ea9:  e8 00 00 00 00          call   eae <si514_set_rate+0x3ae>
> - eae:  be e0 2e 00 00          mov    $0x2ee0,%esi             ;     12.000
> - eb3:  bf 10 27 00 00          mov    $0x2710,%edi             ;     10.000
> + eae:  be 80 84 1e 00          mov    $0x1e8480,%esi           ;  2.000.000
> + eb3:  bf 80 96 98 00          mov    $0x989680,%edi           ; 10.000.000
>   eb8:  ba 02 00 00 00          mov    $0x2,%edx
>   ebd:  e8 00 00 00 00          call   ec2 <si514_set_rate+0x3c2>
>   ec2:  44 8b 74 24 30          mov    0x30(%rsp),%r14d
>
> The asm produced in the caller is mostly the same. Only constant values
> passed to usleep_range_state() or __nsleep_range_delta_state() are
> different. No other instructions or whatever is different.
>
>
> kernel/time/timer.c
> -------------------
> -0000000000000000 <usleep_range_state>:
> +0000000000000000 <__nsleep_range_delta_state>:
>   f3 0f 1e fa           endbr64
>   e8 00 00 00 00        call   ...
>   48 b8 00 00 00 00 00  movabs $0xdffffc0000000000,%rax
> @@ -10692,16 +10692,14 @@
>   41 56                 push   %r14
>   49 c7 c6 00 00 00 00  mov    $0x0,%r14
>   41 55                 push   %r13
> - 41 89 d5              mov    %edx,%r13d
> + 49 89 f5              mov    %rsi,%r13
>   41 54                 push   %r12
> - 49 89 f4              mov    %rsi,%r12
> + 41 89 d4              mov    %edx,%r12d
>   55                    push   %rbp
> - 44 89 ed              mov    %r13d,%ebp
> + 44 89 e5              mov    %r12d,%ebp
>   53                    push   %rbx
>   48 89 fb              mov    %rdi,%rbx
>   81 e5 cc 00 00 00     and    $0xcc,%ebp
> - 49 29 dc              sub    %rbx,%r12              ; (max - min)
> - 4d 69 e4 e8 03 00 00  imul   $0x3e8,%r12,%r12       ; us --> ns (x 1000)
>   48 83 ec 68           sub    $0x68,%rsp
>   48 c7 44 24 08 b3 8a  movq   $0x41b58ab3,0x8(%rsp)
>   b5 41
> @@ -10721,18 +10719,16 @@
>   31 c0                 xor    %eax,%eax
>   e8 00 00 00 00        call   ...
>   e8 00 00 00 00        call   ...
> - 49 89 c0              mov    %rax,%r8
> - 48 69 c3 e8 03 00 00  imul   $0x3e8,%rbx,%rax       ; us --> ns (x 1000)
> + 48 01 d8              add    %rbx,%rax
> + 48 89 44 24 28        mov    %rax,0x28(%rsp)
>   65 48 8b 1c 25 00 00  mov    %gs:0x0,%rbx
>   00 00
> - 4c 01 c0              add    %r8,%rax
> - 48 89 44 24 28        mov    %rax,0x28(%rsp)
>   e8 00 00 00 00        call   ...
>   31 ff                 xor    %edi,%edi
>   89 ee                 mov    %ebp,%esi
>
> Signed-off-by: Christophe JAILLET <christophe.jaillet@xxxxxxxxxx>
> ---
> v1 -> v2
>   - Simplify and avoid use of __buildint_constant_p() [John Stultz <jstultz@xxxxxxxxxx>]
>   - Also update usleep_idle_range()
>   - Axe usleep_range_state()  [John Stultz <jstultz@xxxxxxxxxx>]
>   - Fix kerneldoc  [John Stultz <jstultz@xxxxxxxxxx>]
>   - Update log message accordingly
> https://lore.kernel.org/all/d7fc85736adee02ce52ee88a54fa7477fbd18ed2.1653236802.git.christophe.jaillet@xxxxxxxxxx/
> ---

Thanks for taking the time to rework this! It looks much better to me!

The only nit I have is you still have a few checkpatch issues to resolve:

WARNING: 'convertion' may be misspelled - perhaps 'conversion'?
#154: FILE: include/linux/delay.h:68:
+        * convertion to ns will be optimized-out at compile time.
           ^^^^^^^^^^

ERROR: code indent should use tabs where possible
#198: FILE: kernel/time/timer.c:2124:
+^I^I^I^I        unsigned int state)$

CHECK: Alignment should match open parenthesis
#198: FILE: kernel/time/timer.c:2124:
+void __sched __nsleep_range_delta_state(u64 min, u64 delta,
+                                       unsigned int state)


(also the path lines in the commit message is confusing checkpatch a bit)

With those resolved, when you resubmit, you can add my:
  Acked-by: John Stultz <jstultz@xxxxxxxxxx>

thanks
-john



[Index of Archives]     [Kernel Development]     [Kernel Announce]     [Kernel Newbies]     [Linux Networking Development]     [Share Photos]     [IDE]     [Security]     [Git]     [Netfilter]     [Yosemite News]     [MIPS Linux]     [ARM Linux]     [Device Mapper]

  Powered by Linux