Re: [PATCH] x86/fred: Optimize the FRED entry by prioritizing high-probability event dispatching

On 1/16/2025 8:19 PM, Ethan Zhao wrote:

On 2025/1/17 9:21, H. Peter Anvin wrote:
On 1/16/25 16:37, Ethan Zhao wrote:

hpa suggested introducing "switch_likely" for this kind of optimization on a switch statement, which is also easier to read.  I measured it with a focused user-space test, and it does improve performance a lot. But obviously there is still a lot of work to do.

Finding a way to instruct the compiler to pick the right hot branch while keeping
the code pleasant to read... yup, a lot of work.


It's not that complicated, believe it or not.

/*
 * switch(v) biased for speed in the case v == l
 *
 * Note: gcc is quite sensitive to the exact form of this
 * expression.
 */
#define switch_likely(v,l) \
    switch((__typeof__(v))__builtin_expect((v),(l)))

I tried this macro as follows, but got something really *weird* from gcc.

+#define switch_likely(v,l) \
+        switch((__typeof__(v))__builtin_expect((v),(l)))
+
  __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
  {
         unsigned long error_code = regs->orig_ax;
+       unsigned short etype = regs->fred_ss.type & 0xf;

         /* Invalidate orig_ax so that syscall_get_nr() works correctly */
         regs->orig_ax = -1;

-       switch (regs->fred_ss.type) {
+       switch_likely ((etype == EVENT_TYPE_EXTINT || etype == EVENT_TYPE_OTHER), etype) {

Just swap the two arguments; it should be:
+	switch_likely (etype, EVENT_TYPE_OTHER) {


It is probably also worth checking the __builtin_expect documentation at https://gcc.gnu.org/onlinedocs/gcc/Other-Builtins.html.
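
To make the argument order concrete, here is a minimal user-space sketch. It is not kernel code: the event type values and the dispatch()/dispatch_swapped() helpers are assumptions made up for illustration, and only the switch_likely macro is taken from this thread. __builtin_expect(exp, c) returns exp and merely hints that exp == c is likely, so the first macro argument must be the value being switched on. When a comparison is passed there instead, the switch sees only 0 or 1, which would explain why nearly every case label vanished from the generated code.

```c
/*
 * Illustrative sketch only. The values below and the helper names
 * are assumptions for this example, not the kernel's definitions.
 */
#define EVENT_TYPE_EXTINT	0
#define EVENT_TYPE_NMI		2
#define EVENT_TYPE_OTHER	7

/* hpa's macro, as quoted earlier in this thread */
#define switch_likely(v,l) \
	switch((__typeof__(v))__builtin_expect((v),(l)))

/* Intended order: the switched value first, the likely value second. */
int dispatch(unsigned short etype)
{
	switch_likely (etype, EVENT_TYPE_OTHER) {
	case EVENT_TYPE_EXTINT:
		return 1;
	case EVENT_TYPE_NMI:
		return 2;
	case EVENT_TYPE_OTHER:
		return 3;
	default:
		return -1;
	}
}

/* Swapped order: the switch now operates on a 0/1 comparison result. */
int dispatch_swapped(unsigned short etype)
{
	switch_likely (etype == EVENT_TYPE_EXTINT, etype) {
	case EVENT_TYPE_EXTINT:	/* taken when the comparison is false (0) */
		return 1;
	case EVENT_TYPE_NMI:	/* never taken: the value is only 0 or 1 */
		return 2;
	default:		/* taken when the comparison is true (1) */
		return -1;
	}
}
```

With the swapped form, an EXTINT event lands in the default branch and every other event lands in the EXTINT case, which is consistent with the two-way fred_extint/fred_bad_type split in the disassembly.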

         case EVENT_TYPE_EXTINT:
                 return fred_extint(regs);
         case EVENT_TYPE_NMI:
@@ -256,11 +260,12 @@ __visible noinstr void fred_entry_from_user(struct pt_regs *regs)
  __visible noinstr void fred_entry_from_kernel(struct pt_regs *regs)
  {
         unsigned long error_code = regs->orig_ax;
+       unsigned short etype = regs->fred_ss.type & 0xf;

         /* Invalidate orig_ax so that syscall_get_nr() works correctly */
         regs->orig_ax = -1;

-       switch (regs->fred_ss.type) {
+       switch_likely (etype == EVENT_TYPE_EXTINT, etype) {
         case EVENT_TYPE_EXTINT:
                 return fred_extint(regs);
         case EVENT_TYPE_NMI:

The resulting asm code is as follows:

  objdump -d vmlinux.o | awk '/<fred_entry_from_user>:/{c=65} c&&c--'
00000000000015a0 <fred_entry_from_user>:
     15a0:       0f b6 87 a6 00 00 00 movzbl 0xa6(%rdi),%eax
     15a7:       48 8b 77 78 mov    0x78(%rdi),%rsi
     15ab:       55 push   %rbp
     15ac:       48 c7 47 78 ff ff ff movq   $0xffffffffffffffff,0x78(%rdi)
     15b3:       ff
     15b4:       48 89 e5 mov    %rsp,%rbp
     15b7:       66 83 e0 0f and    $0xf,%ax
     15bb:       74 11 je     15ce <fred_entry_from_user+0x2e>
     15bd:       66 83 f8 07 cmp    $0x7,%ax
     15c1:       74 0b je     15ce <fred_entry_from_user+0x2e>
     15c3:       e8 78 fc ff ff callq  1240 <fred_extint>
     15c8:       5d pop    %rbp
     15c9:       e9 00 00 00 00 jmpq   15ce <fred_entry_from_user+0x2e>
     15ce:       e8 4d fd ff ff callq  1320 <fred_bad_type>
     15d3:       5d pop    %rbp
     15d4:       e9 00 00 00 00 jmpq   15d9 <fred_entry_from_user+0x39>
     15d9:       0f 1f 80 00 00 00 00 nopl   0x0(%rax)

00000000000015e0 <__pfx_fred_entry_from_kernel>:
     15e0:       90                      nop
     15e1:       90                      nop

00000000000015f0 <fred_entry_from_kernel>:
     15f0:       55 push   %rbp
     15f1:       48 8b 77 78 mov    0x78(%rdi),%rsi
     15f5:       48 c7 47 78 ff ff ff movq   $0xffffffffffffffff,0x78(%rdi)
     15fc:       ff
     15fd:       48 89 e5 mov    %rsp,%rbp
     1600:       f6 87 a6 00 00 00 0f testb  $0xf,0xa6(%rdi)
     1607:       75 0b jne    1614 <fred_entry_from_kernel+0x24>
     1609:       e8 12 fd ff ff callq  1320 <fred_bad_type>
     160e:       5d pop    %rbp
     160f:       e9 00 00 00 00 jmpq   1614 <fred_entry_from_kernel+0x24>
     1614:       e8 27 fc ff ff callq  1240 <fred_extint>
     1619:       5d pop    %rbp
     161a:       e9 00 00 00 00 jmpq   161f <fred_entry_from_kernel+0x2f>
     161f:       90                      nop

0000000000001620 <__pfx___fred_entry_from_kvm>:
     1620:       90                      nop
     1621:       90                      nop


Even the fred_entry_from_kernel() asm code doesn't look right.
gcc version 8.5.0 20210514 (Red Hat 8.5.0-10) (GCC)

Did I screw something up?

Thanks,
Ethan

    -hpa






