Re: Compiler optimizing variables in inline assembly

Hi,

I haven't read through the code at all, but I will give you a little
general advice.

Try to cut the code to the absolute minimum that shows the problem.  It
makes it easier for you to work with and check, and it makes it easier
for other people to examine.  Also make sure that the code has no other
dependencies such as extra headers - ideally people should be able to
compile the code themselves and test it (I realise this is difficult for
those who don't have an ARM handy).

Code that works without optimisation but fails with optimisation, or
that only works when you make a variable volatile, always contains a bug.
Occasionally it is a bug in the compiler - but most often it is a bug
in the code.  Either way, it is important to figure out the root cause,
and not try to hide it by making things volatile (though that might be a
good temporary fix for a compiler bug).

I am not familiar with NEON (and not as good as I should be at ARM
assembly in general), but it looks to me as though you have used
specific registers in your inline assembly, and assumed that the
compiler keeps particular things (such as variables) in particular
registers.  Don't do that.  With all optimisation turned off, the
compiler is consistent about which registers it uses for different
purposes - when optimising, it changes register usage in a very
unpredictable way.  You must be explicit - all data
going into your assembly must be declared, as must all data coming out
of the assembly.  And if you use specific registers, you need to tell
the compiler about them (as "clobbers") - and be aware that the compiler
might be using those registers for the input or output values.
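
As a rough illustration of what "declaring everything" looks like, here
is a minimal sketch.  It is not taken from your code: the function name
mul_add and the choice of r3 as a scratch register are made up for the
example, and the plain C branch is only there so that it also builds on
a machine without an ARM compiler.

```c
#include <stdint.h>

/* Hypothetical example: computes a * b + c.
 * Inputs and outputs are listed as operands, so the compiler picks
 * their registers.  r3 is named explicitly inside the assembly, so it
 * is listed as a clobber; the compiler will then not place any of the
 * operands in r3. */
static inline uint32_t mul_add(uint32_t a, uint32_t b, uint32_t c)
{
#if defined(__arm__)
    uint32_t result;
    __asm__ ("mul r3, %[a], %[b]\n\t"
             "add %[res], r3, %[c]"
             : [res] "=r" (result)                  /* outputs  */
             : [a] "r" (a), [b] "r" (b), [c] "r" (c) /* inputs   */
             : "r3");                               /* clobbers */
    return result;
#else
    /* Portable fallback for non-ARM hosts. */
    return a * b + c;
#endif
}
```

If the assembly reads or writes memory through a pointer that is only
passed in as a register operand, a "memory" clobber is also needed so
that the compiler does not keep stale copies of that data in registers.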

Getting inline assembly right is not easy, and it is often best to work
with several small assembly statements rather than large ones - I
usually make a "static inline" function around a line or two of inline
assembly and then use that function in the code as needed.  It can make
the result a lot clearer, and makes it easier to mix the C and assembly
- the end result is often better than I would make in pure assembly.
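
To show the sort of wrapper I mean, here is a small sketch built around
a single ARM instruction (rev, which reverses the bytes of a word).
The names byte_swap32 and swap_buffer are invented for the example, and
the C fallback is there only so it compiles on other targets.

```c
#include <stddef.h>
#include <stdint.h>

/* One instruction of inline assembly, wrapped in a static inline
 * function with its input and output declared. */
static inline uint32_t byte_swap32(uint32_t x)
{
#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 6)
    uint32_t r;
    __asm__ ("rev %0, %1" : "=r" (r) : "r" (x));
    return r;
#else
    /* Portable fallback with the same result. */
    return (x >> 24) | ((x >> 8) & 0x0000ff00u)
         | ((x << 8) & 0x00ff0000u) | (x << 24);
#endif
}

/* Ordinary C around the wrapper: the compiler handles the loop,
 * the addressing and the register allocation. */
static void swap_buffer(uint32_t *buf, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        buf[i] = byte_swap32(buf[i]);
}
```

The compiler is free to inline the wrapper, unroll the loop and schedule
the loads and stores around the rev instruction, which is hard to match
with one big hand-written block.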

Finally, is there a good reason why you need inline assembly rather than
the NEON intrinsics provided by gcc?

<http://gcc.gnu.org/onlinedocs/gcc/ARM-NEON-Intrinsics.html>
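
For comparison, here is a sketch of how a simple "multiply a float
array by a scale factor" loop might look with intrinsics.  This is an
assumed example, not a translation of your code: the function name
scale_floats and its parameters are made up, and the NEON path is only
compiled when the compiler reports NEON support.

```c
#include <stddef.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#endif

/* Hypothetical example: dst[i] = src[i] * scale for i in [0, n). */
static void scale_floats(float *dst, const float *src, size_t n, float scale)
{
    size_t i = 0;

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    /* Process four floats at a time.  The compiler chooses the q/d
     * registers itself, so there is nothing to clobber or to get
     * wrong when optimisation changes register allocation. */
    for (; i + 4 <= n; i += 4) {
        float32x4_t v = vld1q_f32(src + i);
        v = vmulq_n_f32(v, scale);
        vst1q_f32(dst + i, v);
    }
#endif

    /* Remaining elements (or all of them on a non-NEON build). */
    for (; i < n; ++i)
        dst[i] = src[i] * scale;
}
```

Because the scale factor is an ordinary function argument, the compiler
knows it is used and there is no need to make anything volatile.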


mvh.,

David




On 19/02/14 20:04, Cody Rigney wrote:
> Hi,
> 
> I'm trying to add NEON optimizations to OpenCV's LK optical flow.  See
> link below.
> https://github.com/Itseez/opencv/blob/2.4/modules/video/src/lkpyramid.cpp
> 
> The gcc version could vary since this is an open source project, but
> the one I'm currently using is 4.8.1. The target architecture is ARMv7
> w/ NEON. The processor I'm testing on is an ARM
> Cortex-A15(big.LITTLE).
> 
> The problem is, in release mode (where optimizations are set) it does
> not work properly. However, in debug mode, it works fine. I tracked
> down a specific variable(FLT_SCALE) that was being optimized out and
> made it volatile and that part worked fine after that. However, I'm
> still having incorrect behavior from some other optimization.  I'm new
> to inline assembly, so I thought maybe I'm doing something wrong
> that's not telling the compiler that I'm using a certain variable.
> 
> Below is the code at its current state. Ignore all the comments and
> volatiles(for testing this problem) everywhere. It's WIP. I removed
> unnecessary functions and code so it would be easier to see. I think
> the problem is in the bottom-most asm block because if I do if(false)
> to skip it, I don't run into the problem. Thanks.
> 

<snip>





