From: Christophe JAILLET
> Sent: 29 July 2022 21:29
>
> Most of the time the 'min' and 'max' parameters of usleep_range() are
> constant. We can take advantage of it to pre-compute at compile time
> some values otherwise computed at run-time in usleep_range_state().
>
> Replace usleep_range_state() by a new __nsleep_range_delta_state() function
> that takes as parameters the pre-computed values.
>
> The main benefit is to save a few instructions, especially 2
> multiplications (x1000 when converting us to ns).
...
>    53                      push   %rbx
>    48 89 fb                mov    %rdi,%rbx
>    81 e5 cc 00 00 00       and    $0xcc,%ebp
> -  49 29 dc                sub    %rbx,%r12           ; (max - min)
> -  4d 69 e4 e8 03 00 00    imul   $0x3e8,%r12,%r12    ; us --> ns (x 1000)
>    48 83 ec 68             sub    $0x68,%rsp
>    48 c7 44 24 08 b3 8a    movq   $0x41b58ab3,0x8(%rsp)
>    b5 41
> @@ -10721,18 +10719,16 @@
>    31 c0                   xor    %eax,%eax
>    e8 00 00 00 00          call   ...
>    e8 00 00 00 00          call   ...
> -  49 89 c0                mov    %rax,%r8
> -  48 69 c3 e8 03 00 00    imul   $0x3e8,%rbx,%rax    ; us --> ns (x 1000)
> +  48 01 d8                add    %rbx,%rax
> +  48 89 44 24 28          mov    %rax,0x28(%rsp)
>    65 48 8b 1c 25 00 00    mov    %gs:0x0,%rbx
>    00 00
> -  4c 01 c0                add    %r8,%rax
> -  48 89 44 24 28          mov    %rax,0x28(%rsp)
>    e8 00 00 00 00          call   ...
...

Is that really measurable in any test?
Integer multiply is one clock on almost every modern cpu.
By the time you've allowed for a superscalar cpu there is probably
no difference at all on anything except the simplest cpus.

	David
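
For anyone following along, the idea in the quoted patch can be sketched roughly as below. This is a userspace illustration only, not the patch itself: nsleep_range_delta_sketch() and usleep_range_sketch() are hypothetical names, and the real signature of __nsleep_range_delta_state() is not shown in the quoted text. The point is that the us --> ns conversion moves into an inline wrapper, so constant arguments get folded by the compiler at the call site.

```c
#include <stdint.h>

#define NSEC_PER_USEC 1000ULL

/* Records what the out-of-line helper received, for illustration only. */
static uint64_t last_min_ns;
static uint64_t last_delta_ns;

/*
 * Hypothetical stand-in for the out-of-line helper: it takes values
 * that are already in nanoseconds, so it does no multiplication.
 */
static void nsleep_range_delta_sketch(uint64_t min_ns, uint64_t delta_ns)
{
	last_min_ns = min_ns;
	last_delta_ns = delta_ns;
}

/*
 * Inline wrapper: when min_us and max_us are compile-time constants,
 * an optimising compiler can fold both products into immediates.
 * Assumes max_us >= min_us, as callers of usleep_range() are expected
 * to guarantee.
 */
static inline void usleep_range_sketch(unsigned long min_us,
				       unsigned long max_us)
{
	nsleep_range_delta_sketch(min_us * NSEC_PER_USEC,
				  (max_us - min_us) * NSEC_PER_USEC);
}
```

A call such as usleep_range_sketch(100, 200) would pass 100000 and 100000 to the helper. Whether dropping the two run-time multiplies is measurable is exactly the question raised above.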