Re: how to generate code-loops

David Livshin <david.livshin@xxxxxxxxxxx> · Sat, 30 Sep 2006 21:38:54 +0300

Brian Budge wrote:
Hi David -

a)  It looks like a loop to me.
No, it is not a loop, at least not the kind of loop I would like to
have. Perhaps I should have mentioned this in my message, but the body
of the "loop" should be a single basic block ( no jumps into or out of
it ), and the generated code is not.
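
To make the requirement concrete, here is a minimal sketch in C of the
shape being asked for, written with an explicit label and goto so that it
mirrors the "Label / <code> / conditional jump" form from the original
question. The function name diff_single_block and the n == 0 guard are
assumptions made for illustration only; whether a given compiler and
option set really emits a single basic block for it is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: x[k] = y[k+1] - y[k] for k in [0, n).
   y must hold at least n + 1 elements. */
void diff_single_block(double *x, const double *y, size_t n)
{
    size_t k = 0;

    assert(x != NULL && y != NULL);
    if (n == 0)
        return;              /* guard: the loop below runs at least once */
label:
    x[k] = y[k + 1] - y[k];  /* body: no jumps into or out of it */
    k++;
    if (k < n)
        goto label;          /* the only branch: conditional jump back */
}
```

The same shape can be written as a do { ... } while ( k < n ) loop; the
goto form is used here only to make the single backward branch explicit.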
b)  You instructed the compiler to unroll all loops, which is
apparently what it did.

Maybe I'm confused about what you are asking?

 Brian

On 9/30/06, David Livshin <david.livshin@xxxxxxxxxxx> wrote:

Hi,

Is there a way ( command line option ) to instruct the compiler to
generate code-loops ( when possible ):

Label:
    <code>
    conditional jump to 'Label'

Compiling

for ( k=0 ; k<n ; k++ )
{
    x[k] = y[k+1] - y[k];
}

with the command line options

" -O3 -fomit-frame-pointer -funroll-all-loops -ffast-math
-march=pentium4 -mfpmath=sse -msse2 -mmmx"

produces

.L1064:
    movsd as1+32032(,%edx,8), %xmm1
    subsd as1+32024(,%edx,8), %xmm1
    movsd %xmm1, as1+24016(,%edx,8)
    cmpl %ecx, %edx
    je .L1010
.L1012:
    leal 1(%edx), %esi
    movsd as1+32032(,%esi,8), %xmm0
    subsd as1+32024(,%esi,8), %xmm0
    movsd %xmm0, as1+24016(,%esi,8)
    leal 2(%edx), %ebx
    movsd as1+32032(,%ebx,8), %xmm7
    subsd as1+32024(,%ebx,8), %xmm7
    movsd %xmm7, as1+24016(,%ebx,8)
    leal 3(%edx), %eax
    movsd as1+32032(,%eax,8), %xmm6
    subsd as1+32024(,%eax,8), %xmm6
    movsd %xmm6, as1+24016(,%eax,8)
    leal 4(%edx), %esi
    movsd as1+32032(,%esi,8), %xmm5
    subsd as1+32024(,%esi,8), %xmm5
    movsd %xmm5, as1+24016(,%esi,8)
    leal 5(%edx), %ebx
    movsd as1+32032(,%ebx,8), %xmm4
    subsd as1+32024(,%ebx,8), %xmm4
    movsd %xmm4, as1+24016(,%ebx,8)
    leal 6(%edx), %eax
    movsd as1+32032(,%eax,8), %xmm3
    subsd as1+32024(,%eax,8), %xmm3
    movsd %xmm3, as1+24016(,%eax,8)
    leal 7(%edx), %esi
    movsd as1+32032(,%esi,8), %xmm2
    subsd as1+32024(,%esi,8), %xmm2
    movsd %xmm2, as1+24016(,%esi,8)
    addl $8, %edx
    jmp .L1064
.L1010:

which is not a loop; it is difficult to process ( and also runs slower
than when the code is a loop ).
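
For reference, the control flow of the listing can be modelled in C
roughly as follows. This is only a sketch: the function name
diff_as_emitted, the parameter names, and the mapping of %edx to k and
%ecx to last are assumptions read off the listing, and the code that
precedes .L1064 is not shown, so the sketch assumes it arranges for
( last - k ) to be a multiple of 8. The seven unrolled copies are written
as an inner loop for brevity.

```c
#include <assert.h>

/* Hypothetical model of the listing's control flow; not compiler output. */
void diff_as_emitted(double *x, const double *y, int k, int last)
{
    /* Assumed precondition; without it the exit test is never hit. */
    assert(k <= last && (last - k) % 8 == 0);

    for (;;) {
        /* .L1064: first copy of the body, followed by the exit test */
        x[k] = y[k + 1] - y[k];
        if (k == last)
            break;                       /* je .L1010 */

        /* .L1012: seven more copies, at indices k+1 .. k+7 */
        for (int i = 1; i <= 7; i++)
            x[k + i] = y[k + i + 1] - y[k + i];

        k += 8;                          /* addl $8, %edx */
    }                                    /* jmp .L1064 */
}
```

The exit test sits in the middle of the repeated code and the backward
branch is unconditional, so the repeated region is split into two basic
blocks rather than forming one block that ends in a conditional jump.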

Thank you in advance,

David

--
David Livshin
david.livshin@xxxxxxxxxxx

http://www.dalsoft.com
