Re: x86 gcc lacks simple optimization

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Dec 6, 2013 at 11:19 AM, Konstantin Vladimirov
<konstantin.vladimirov@xxxxxxxxx> wrote:
> Hi,
>
> nothing changes if everything is unsigned and we are guaranteed to not
> raise UB on overflow:
>
> unsigned foo(unsigned char *t, unsigned char *v, unsigned w)
> {
> unsigned i;
>
> for (i = 1; i != w; ++i)
> {
> unsigned x = i << 2;
> v[x + 4] = t[x + 4];
> }
>
> return 0;
> }
>
> yields:
>
> .L5:
> leal 0(,%eax,4), %edx
> addl $1, %eax
> movzbl 4(%edi,%edx), %ecx
> cmpl %ebx, %eax
> movb %cl, 4(%esi,%edx)
> jne .L5
>
> What is SCEV infrastructure (guessing scalar evolutions?) and what
> files/passes to look in?

tree-scalar-evolution.c, look at where it handles MULT_EXPR but
lacks LSHIFT_EXPR support.

Richard.

> ---
> With best regards, Konstantin
>
> On Fri, Dec 6, 2013 at 2:10 PM, Richard Biener
> <richard.guenther@xxxxxxxxx> wrote:
>> On Fri, Dec 6, 2013 at 9:30 AM, Konstantin Vladimirov
>> <konstantin.vladimirov@xxxxxxxxx> wrote:
>>> Hi,
>>>
>>> Consider code:
>>>
>>> int foo(char *t, char *v, int w)
>>> {
>>> int i;
>>>
>>> for (i = 1; i != w; ++i)
>>> {
>>> int x = i << 2;
>>> v[x + 4] = t[x + 4];
>>> }
>>>
>>> return 0;
>>> }
>>>
>>> Compile it to x86 (I used both gcc 4.7.2 and gcc 4.8.1) with options:
>>>
>>> gcc -O2 -m32 -S test.c
>>>
>>> You will see loop, formed like:
>>>
>>> .L5:
>>> leal 0(,%eax,4), %edx
>>> addl $1, %eax
>>> movzbl 4(%edi,%edx), %ecx
>>> cmpl %ebx, %eax
>>> movb %cl, 4(%esi,%edx)
>>> jne .L5
>>>
>>> But it can be easily simplified to something like this:
>>>
>>> .L5:
>>> addl $1, %eax
>>> movzbl (%esi,%eax,4), %edx
>>> cmpl %ecx, %eax
>>> movb %dl, (%ebx,%eax,4)
>>> jne .L5
>>>
>>> (i.e. left shift may be moved to address).
>>>
>>> First question to gcc-help maillist. May be there are some options,
>>> that I've missed, and there IS a way to explain gcc my intention to do
>>> this?
>>>
>>> And second question to gcc developers mail list. I am working on
>>> private backend and want to add this optimization to my backend. What
>>> do you advise me to do -- custom gimple pass, or rtl pass, or modify
>>> some existent pass, etc?
>>
>> This looks like a deficiency in induction variable optimization.  Note
>> that i << 2 may overflow and this overflow does not invoke undefined
>> behavior but is in the implementation defined behavior category.
>>
>> The issue in this case is likely that the SCEV infrastructure does not handle
>> left-shifts.
>>
>> Richard.
>>
>>> ---
>>> With best regards, Konstantin




[Index of Archives]     [Linux C Programming]     [Linux Kernel]     [eCos]     [Fedora Development]     [Fedora Announce]     [Autoconf]     [The DWARVES Debugging Tools]     [Yosemite Campsites]     [Yosemite News]     [Linux GCC]

  Powered by Linux