Re: [PATCH v3] Add iwmmxt optimization for sbc for pxa series cpu

On Monday 15 November 2010 04:46:25 Keith Mok wrote:
> > I sometimes use different indentation levels in such cases in order to
> > improve readability after instructions reordering, so that each
> > logically independent block of code has its own indentation level and it
> > is still easily visible after instructions reordering. For example, with
> > the original code:
>
> Thanks for the hints. I rearranged the code.

Thanks, now the assembly code looks ok to me. I also discovered that qemu
supports iwmmxt1 emulation just fine, and tested your optimizations for
correctness myself (with a script which tries different encoding parameters
for different audio samples and checks md5 checksums); no problems detected.
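
For illustration, a check of this kind could be sketched roughly as below.
This is a minimal Python sketch and not the script actually used: the
parameter grid, the function names and the idea of passing the two encoders
in as callables are all assumptions. In practice the callables would wrap
runs of the reference build and the iwmmxt build (for example under qemu).

```python
import hashlib
import itertools

# Hypothetical parameter grid; the values tried by the real script are not
# stated in this thread.
PARAM_GRID = {
    "subbands": [4, 8],
    "blocks": [4, 8, 12, 16],
    "bitpool": [16, 32, 53],
    "joint_stereo": [False, True],
}


def param_combinations(grid):
    """Yield every combination of encoder parameters as a dict."""
    keys = sorted(grid)
    for values in itertools.product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))


def md5_hex(data):
    """Return the md5 checksum of a bytes object as a hex string."""
    return hashlib.md5(data).hexdigest()


def find_mismatches(samples, reference_encode, candidate_encode,
                    grid=PARAM_GRID):
    """Compare two encoders over all samples and parameter combinations.

    samples maps a sample name to raw PCM bytes. Each encoder is a callable
    taking (pcm_bytes, params) and returning the encoded bytes.
    Returns a list of (sample_name, params) pairs whose checksums differ.
    """
    mismatches = []
    for name, pcm in sorted(samples.items()):
        for params in param_combinations(grid):
            expected = md5_hex(reference_encode(pcm, params))
            actual = md5_hex(candidate_encode(pcm, params))
            if expected != actual:
                mismatches.append((name, params))
    return mismatches
```

An empty result means the optimized encoder produced bit-identical output
for every sample and parameter combination that was tried.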

So if somebody else could check whether the other things are right (copyright
notices for example), then we are done with it.

> I removed the scale_factor optimization since from the result I
> tested, it shows little help in performance.

I guess after easily doubling performance by adding simd optimizations to the
sbc analysis filter, a further improvement of just roughly 10% (as measured
for x86 and arm neon) does not look particularly impressive anymore:
http://git.kernel.org/?p=bluetooth/bluez.git;a=commit;h=95465b816f0ce7f0ec10a183ce7ff0c6f83d86eb
http://git.kernel.org/?p=bluetooth/bluez.git;a=commit;h=d049a9a2aec2b518e04f11ef0ecc355db8237291

But I still think that every little bit helps. Did you also get something like 
10% speedup, or was it even worse than that?

A bit more important in practice is the optimization for joint stereo scale
factors calculation (because it is typically used for A2DP). It provided
an almost 20% performance improvement for arm neon:
http://git.kernel.org/?p=bluetooth/bluez.git;a=commit;h=e1ea3e76c72d56041c30b317818e8d7b5a0c7350

So 'sbc_calc_scalefactors_j_iwmmxt' may be a nice addition too, optimized 
either as a whole for best performance (like in arm neon code), or just with
some small chunks of assembly like in 'sbc_calc_scalefactors_mmx' because it
is easier this way.

-- 
Best regards,
Siarhei Siamashka
