On 11/4/08, Liam Vrilehen <vrilehen@xxxxxxxxxxxxxx> wrote:
> Saturated integer addition in C
>
> I'm writing a C application that contains a considerable amount of
> saturated integer math. Being curious, I took a glance at the
> assembler gcc-4.3.1 generates for x86.
>
> The problem seems to be that I haven't found any way of coding
> that would result in overflow/carry dependent jumps or MMX/SSE2
> (MOVD -> PADDD -> MOVD).
>
> Any hints/ideas are welcome on how to write portable C code
> for saturated integer math. (Inline assembler isn't an option.)
>
> Below the C code, generated assembler and hand coded assembler
> for saturated addition of two 32 bit signed integers.
>
> The best C for gcc I've come up with so far:
> | #include <limits.h>
> |
> | int sum_saturated_gcc(int a, int b){
> |     int r = a + b;

This statement produces a signed integer overflow exactly when
saturation would occur. Signed overflow is undefined behavior in C,
so the code is not portable. You need to pre-test the arguments
before adding. E.g.

    if ( b > 0 && INT_MAX - b < a )
        r = INT_MAX;
    else ....
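
To spell the pre-test out for both directions, here is a minimal
sketch. The name sum_saturated_portable is made up for illustration,
and I have not looked at what code gcc generates for it, so treat it
as a starting point rather than a tuned implementation.

```c
#include <limits.h>

/* Saturating signed addition that never evaluates an overflowing a + b.
 * The function name is hypothetical, chosen for this example only. */
int sum_saturated_portable(int a, int b)
{
    /* b > 0: INT_MAX - b cannot overflow; a + b would exceed INT_MAX. */
    if (b > 0 && a > INT_MAX - b)
        return INT_MAX;

    /* b < 0: INT_MIN - b cannot overflow; a + b would fall below INT_MIN. */
    if (b < 0 && a < INT_MIN - b)
        return INT_MIN;

    /* The sum is representable, so the addition is well defined. */
    return a + b;
}
```

Both comparisons subtract b from the limit on the side where that
subtraction cannot itself overflow, so no expression in the function
leaves the range of int.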
> |
> |     if(0 <= r)
> |         if((a < 0) && (b < 0))
> |             r = INT_MIN;
> |         else if ((0 < a) && (0 < b))
> |             r = INT_MAX;
> |     return r;
> | }
>
> The generated assembler (-m32 -O3 -mmmx -msse -msse2 -msse3):
> |         .file   "a.c"
> |         .text
> |         .p2align 4,,15
> | .globl sum_saturated_gcc
> |         .type   sum_saturated_gcc, @function
> | sum_saturated_gcc:
> |         pushl   %ebp
> |         movl    %esp, %ebp
> |         movl    12(%ebp), %edx
> |         addl    8(%ebp), %edx
> |         jns     .L6
> | .L2:
> |         movl    %edx, %eax
> |         popl    %ebp
> |         ret
> |         .p2align 4,,7
> |         .p2align 3
> | .L6:
> |         movl    8(%ebp), %eax
> |         testl   %eax, %eax
> |         js      .L7
> | .L3:
> |         movl    8(%ebp), %ecx
> |         testl   %ecx, %ecx
> |         jle     .L2
> |         movl    12(%ebp), %eax
> |         testl   %eax, %eax
> |         movl    $2147483647, %eax
> |         cmovg   %eax, %edx
> |         jmp     .L2
> |         .p2align 4,,7
> |         .p2align 3
> | .L7:
> |         movl    12(%ebp), %eax
> |         testl   %eax, %eax
> |         jns     .L3
> |         movl    $-2147483648, %edx
> |         jmp     .L2
> |         .size   sum_saturated_gcc, .-sum_saturated_gcc
> |         .ident  "GCC: (Gentoo 4.3.1-r1 p1.1) 4.3.1"
> |         .section        .note.GNU-stack,"",@progbits
>
>
> Hand coded assembler: plain x86, no MMX/SSE2 (MOVD -> PADDD -> MOVD)
> |         .text
> | .globl sum_saturated
> |         .type   sum_saturated, @function
> | sum_saturated:
> |         # prolog
> |         pushl   %ebp
> |         movl    %esp, %ebp
> |
> |         # add
> |         movl    8(%ebp), %eax
> |         addl    12(%ebp), %eax
> |
> |         # check overflow
> |         jno     .L_done
> |
> |         # check type of overflow
> |         js      .L_max
> |
> | .L_min:
> |         # overflow to positive
> |         movl    $-2147483648, %eax
> |         jmp     .L_done
> |
> | .L_max:
> |         # overflow to negative
> |         movl    $2147483647, %eax
> |
> | .L_done:
> |         # epilog
> |         popl    %ebp
> |         ret
> |         .size   sum_saturated, .-sum_saturated

-- Lawrence Crowl