Re: optimization of switch statements on i386

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Thanks. I received two answers that suggest that gcc uses a jump table
for consecutive values. I'm unable to reproduce this with gcc4.1.2.
(My problem is that it appears to always use a binary search.)

Example:

gback@top [311](~/tmp) > gcc -v
Using built-in specs.
Target: i386-redhat-linux
Configured with: ../configure --prefix=/usr --mandir=/usr/share/man
--infodir=/usr/share/info --enable-shared --enable-threads=posix
--enable-checking=release --with-system-zlib --enable-__cxa_atexit
--disable-libunwind-exceptions --enable-libgcj-multifile
--enable-languages=c,c++,objc,obj-c++,java,fortran,ada
--enable-java-awt=gtk --disable-dssi --enable-plugin
--with-java-home=/usr/lib/jvm/java-1.4.2-gcj-1.4.2.0/jre
--with-cpu=generic --host=i386-redhat-linux
Thread model: posix
gcc version 4.1.2 20070626 (Red Hat 4.1.2-13)
gback@top [324](~/tmp) > cat switch.c
enum four { A, B, C, D };
extern void g(int);

void f(enum four x)
{
  switch (x)
    {
      case A: g(1); break;
      case B: g(2); break;
      case C: g(3); break;
      case D: g(4); break;
      default: gcc_unreachable();
    }
}
gback@top [325](~/tmp) > gcc -S -O4 switch.c
gback@top [326](~/tmp) > cat switch.s
        .file   "switch.c"
        .text
        .p2align 4,,15
.globl f
        .type   f, @function
f:
        pushl   %ebp
        movl    %esp, %ebp
        movl    8(%ebp), %eax
        cmpl    $1, %eax
        je      .L4
        jb      .L3
        cmpl    $2, %eax
        je      .L5
        cmpl    $3, %eax
        .p2align 4,,7
        je      .L11
        popl    %ebp
        .p2align 4,,6
        jmp     gcc_unreachable
        .p2align 4,,7
.L3:
        movl    $1, 8(%ebp)
        popl    %ebp
        jmp     g
        .p2align 4,,7
.L4:
        movl    $2, 8(%ebp)
        popl    %ebp
        jmp     g
.L11:
        movl    $4, 8(%ebp)
        popl    %ebp
        jmp     g
        .p2align 4,,7
.L5:
        movl    $3, 8(%ebp)
        popl    %ebp
        jmp     g
        .size   f, .-f
        .ident  "GCC: (GNU) 4.1.2 20070626 (Red Hat 4.1.2-13)"
        .section        .note.GNU-stack,"",@progbits

---

which is a binary search, not a jump table (unless I'm missing something.)

Thanks!

 - Godmar

On 08 Feb 2008 18:04:23 -0800, Ian Lance Taylor <iant@xxxxxxxxxx> wrote:
> "Godmar Back" <godmar@xxxxxxxxx> writes:
>
> > I have a question regarding the compilation of switch statements for
> > i386 targets. What heuristics or algorithm does gcc use to decide
> > whether to use a binary search or a jump table? I'm specifically
> > interested in switch statements for a dense range of values and in
> > which the default: branch is unreachable.
>
> Look at expand_case in gcc/stmt.c.
>
> > I've tried a number of approaches in order to obtain a jump table, but
> > it seems gcc 4.x always ends up creating binary searches. For
> > instance, I've tried casting the switch value to a limited range enum
> > and placing a gcc_unreachable() in the default: case. A cursory
> > inspection of the source code also didn't reveal any documentation of
> > the strategy used. Does gcc have reason to believe that a binary
> > search is always faster?
>
> No, it uses a heuristic to choose.  Probably the most relevant one
> here is that if the difference between the maximum and minimum case
> labels is more than 10 * the number of case labels, gcc will use
> comparisons rather than a table.
>
> > If so, is this true for all variants of i386?
>
> I believe that all i386 variants are handled in the same way.
>
> > Would it depend on the number of case arms?
>
> Yes.
>
> Ian
>

[Index of Archives]     [Linux C Programming]     [Linux Kernel]     [eCos]     [Fedora Development]     [Fedora Announce]     [Autoconf]     [The DWARVES Debugging Tools]     [Yosemite Campsites]     [Yosemite News]     [Linux GCC]

  Powered by Linux