[PATCH v2 1/6] core: add ARM NEON optimized mono-to-stereo/stereo-to-mono remapping code

On 06.07.2012 11:29, Peter Meerwald wrote:
> Hello,
>
>> Can you give a status update: what remains to be done before these
>> patches can be merged?
> I am glad you asked; I am working on a v3 these days to address the
> following issues:
>
> - performance degradation on Cortex-A9 / PandaBoard for remap: NEON is
> fast on Cortex-A8 but slow on A9; need to distinguish between the two
> - reimplement using gcc inline assembly instead of intrinsics (similar to
> the SBC NEON code)
> - fix issues in test code


Does it really degrade compared to the C code? That seems surprising.

I read (on android-ndk) that the speedup from NEON is a lot smaller
on the A9 (60% on A8 vs 10% on A9 in one scenario), but it's still a speedup.

This is a part of that conversation:

> Yes this is normal when comparing NEON boost on an ARM Cortex-A8 CPU
> compared to an ARM Cortex-A9 CPU, because Cortex-A9 is much more
> advanced than Cortex-A8, so regular C code should run faster on
> Cortex-A9 (e.g. Galaxy Nexus) compared to Cortex-A8 (e.g. Nexus S),
> but NEON code does not necessarily run any faster. So it makes NEON
> give lower speed boost compared to regular C, but it does not mean
> the code actually runs slower.
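
For reference, the scalar C baseline that a NEON remap would be measured
against on each CPU can be sketched roughly as below. This is only a minimal
sketch, not the code from the patch: the function names, the signed 16-bit
sample format, and plain averaging for the stereo-to-mono downmix are all
assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar baseline (names are illustrative, not from the patch).
 * Mono to stereo: copy each mono sample into the left and right slot.
 * dst must hold 2 * n samples. */
static void remap_mono_to_stereo_c(int16_t *dst, const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        dst[2 * i]     = src[i];   /* left  */
        dst[2 * i + 1] = src[i];   /* right */
    }
}

/* Stereo to mono: average left and right (assumed downmix rule).
 * src holds 2 * n interleaved samples, dst holds n samples.
 * The sum is formed in 32 bits so it cannot overflow int16_t. */
static void remap_stereo_to_mono_c(int16_t *dst, const int16_t *src, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        int32_t sum = (int32_t)src[2 * i] + (int32_t)src[2 * i + 1];
        dst[i] = (int16_t)(sum / 2);
    }
}
```

Timing these loops and the NEON variant over the same buffers on both an A8
and an A9 would show whether NEON is actually slower than C on the A9, or
just less of a win.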

Best regards.

