On Friday 10 February 2012 19:13:38, Peter Meerwald wrote:
> > "-mfpu=neon", then the compiler assumes the code will run ONLY on
> > NEON-capable ARM devices. If you want to do run-time detection, you MUST
> > NOT pass the corresponding compiler flag. (The same is true of MMX and
> > SSE by the way.)
>
> so the simple solution is to
> - drop the runtime check
> - use NEON if the compiler provides NEON
>
> if the code has to run on non-NEON platforms, NEON support cannot be
> enabled in the compiler

Correct.

> the more involved solution is to
> - have the runtime check in place
> - compile code with different compiler flags
> - make a decision at runtime and call different code path

Right. It's much easier for dedicated assembler source code than for inline
assembler or intrinsics, as you can simply override the FPU in the source:

    .fpu neon

In GCC, the "target" function attribute would address this problem nicely,
but it is not supported on ARM (only on x86) at the moment, to my knowledge :-(

> in PulseAudio, the MMX/SSE code path use inline assembler; surprisingly
> (for me at least), gcc happily compiles inline assembler SSE/MMX code even
> with -march=i386, i.e. arch=i386 does not get passed to the assembler

I think that is a grandfathered bug in x86 GCC. But even then, the compiler
will reject MMX or SSE registers in the clobber list of the inline assembler.
Without a valid clobber list, you cannot safely write inline assembler. As
long as the MMX and SSE registers are not used for anything else in the same
thread, it works. Then one day, someone compiles the software with SSE for FPU
computations or whatever, and it explodes due to register corruption.

So from my point of view, run-time MMX and SSE selection is just as hard as
ARM NEON's. The only extra nicety is the GCC 4.4 per-function target attribute:

    __attribute__((__target__("mmx")))
    __attribute__((__target__("sse")))

which enables MMX or SSE on a per-function basis.
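
For illustration, a minimal sketch of the per-function attribute combined with
a run-time check. The names add4_c, add4_sse and add4 are made up for this
example. It assumes an x86 GCC or Clang recent enough to provide
__builtin_cpu_supports() (GCC 4.8 or later, which is newer than the 4.4
needed for the attribute itself); on any other target it falls back to plain C:

```c
#include <assert.h>
#include <stddef.h>

/* Portable fallback: adds four floats element by element. */
static void add4_c(float *dst, const float *a, const float *b)
{
    for (size_t i = 0; i < 4; i++)
        dst[i] = a[i] + b[i];
}

#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
# define HAVE_SSE_PATH 1
/* SSE is enabled for this one function only, so the xmm registers are
 * accepted in the clobber list even if the file is built without -msse. */
__attribute__((__target__("sse")))
static void add4_sse(float *dst, const float *a, const float *b)
{
    __asm__ volatile (
        "movups (%1), %%xmm0\n\t"   /* load a[0..3] (unaligned) */
        "movups (%2), %%xmm1\n\t"   /* load b[0..3] (unaligned) */
        "addps  %%xmm1, %%xmm0\n\t" /* xmm0 += xmm1 */
        "movups %%xmm0, (%0)\n\t"   /* store the sum to dst[0..3] */
        : /* no register outputs, the result goes through memory */
        : "r" (dst), "r" (a), "r" (b)
        : "xmm0", "xmm1", "memory");
}
#endif

/* Run-time selection: only call the SSE version if the CPU has SSE. */
void add4(float *dst, const float *a, const float *b)
{
    assert(dst != NULL && a != NULL && b != NULL);
#ifdef HAVE_SSE_PATH
    if (__builtin_cpu_supports("sse")) {
        add4_sse(dst, a, b);
        return;
    }
#endif
    add4_c(dst, a, b);
}
```

Callers only ever use add4(); the file itself is compiled with the default
flags, and only add4_sse() is allowed to touch the xmm registers.
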
Then you can include the mm or xmm registers in the clobber list. So there's
no need to fiddle with compiler flags.

> PulseAudio simply assumes that the compiler is recent enough to know about
> MMX/SSE, there is no compile-time probing or checks such as #ifdef
> __SSE__ (fair enough)

That's probably wrong. In VLC, we have had cases of corrupted builds depending
on the compiler flags, while making that assumption.

> to take this solution, some build infrastructure is needed; it might be
> required as well for the SSE3 resampler patches in discussion
>
> this means:
> - probe compiler flags (such as -msse2, -mfpu=neon)
> - probably configure options to override
> - passing different compiler flags to different compilation units
>
> which route shall we go?

-- 
Rémi Denis-Courmont
http://www.remlab.net/
http://fi.linkedin.com/in/remidenis