On 06.07.2012 11:29, Peter Meerwald wrote:
> Hello,
>
>> Can you give a status update: what remains to be done before these
>> patches can be merged?
> I am glad you asked; I am working on a v3 these days to address the
> following issues:
>
> - performance degradation on Cortex-A9 / pandaboard for remap: NEON is
>   fast on Cortex-A8 but slow on A9; need to distinguish
> - reimplement using gcc inline assembly instead of intrinsics (similar to
>   the SBC neon code)
> - fix issues in test code

Does it really degrade? Compared to the C code? That seems surprising. I read (on android-ndk) that the speedup from NEON is a lot smaller on A9 (60% on A8 vs 10% on A9 in one scenario), but it's still a speedup. This is part of that conversation:

> Yes this is normal when comparing NEON boost on an ARM Cortex-A8 CPU
> compared to an ARM Cortex-A9 CPU, because Cortex-A9 is much more
> advanced than Cortex-A8, so regular C code should run faster on
> Cortex-A9 (eg: Galaxy Nexus) compared to Cortex-A8 (eg: Nexus S), but
> NEON code does not necessarily run any faster. So it makes NEON give
> lower speed boost compared to regular C, but it does not mean the code
> actually runs slower

Best regards.