Re : Effects of Clock Resolution on Pulseaudio

>>>> So I am not sure what part of PulseAudio is causing high CPU utilization
>>>> ..

I can tell you that the hot spots are mixing, software volume and resampling.


> Hmm, that function is not optimized in any way, but when I look at its
> source it doesn't appear that slow to me either. For each sample we do
> one multiplication, one shift, we apply saturation and then we
> increase/decrease pointers with wraparound. That shouldn't be that
> bad. Also, this code goes once linearly through all samples, which should
> minimize the influence of the cache.

There is also an array lookup of the channel volume (on every iteration
of the for loop), and two increment variables.  On an ARM processor this
is probably enough extra variables to exceed the number of available
registers and force spills to the stack.  The easiest thing would be to
process one channel at a time, incrementing your pointer by the channel
count and using the end-of-array pointer as the stop condition instead
of keeping two count variables.  I also hope you have optimizations
turned on in your compiler, or you will get a divide instead of a shift.
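
To make that concrete, here is a minimal sketch of the one-channel-at-a-time
loop. It is not PulseAudio code: the function name, the Q15 gain format
(0x8000 meaning unity) and the interleaved S16 layout are all assumptions
chosen for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Clamp a 32-bit intermediate to the signed 16-bit sample range. */
static int16_t clamp_s16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t) v;
}

/*
 * Scale one channel of an interleaved S16 buffer in place.
 * Hypothetical helper: gain_q15 is an assumed fixed-point gain where
 * 0x8000 is 1.0. The only loop state is one moving pointer and the end
 * pointer; the gain is loaded once, outside the loop.
 */
static void scale_channel_s16(int16_t *buf, size_t frames,
                              unsigned channels, unsigned channel,
                              uint16_t gain_q15)
{
    int16_t *p = buf;                        /* walks frame by frame */
    int16_t *end = buf + frames * channels;  /* one past the last frame */

    for (; p != end; p += channels) {
        /* Dividing by a constant power of two lets an optimizing
         * compiler emit shifts rather than a real divide. */
        p[channel] = clamp_s16(((int32_t) p[channel] * gain_q15) / 32768);
    }
}
```

You would call it once per channel, each call with that channel's own gain,
so the per-sample volume array lookup disappears from the inner loop.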


It definitely is possible to run PulseAudio efficiently on an ARM
processor.   Take a look at this for example:
http://developer.garmin.com/linux/nuvi-8xx-series/


I've been working on a modified version of pa_mix for my particular
ARM that should be faster.   It only works for S16 (signed 16-bit)
samples and doesn't do per-channel volume (it uses the first channel's
volume for everything), but here is a little of it.  You need to modify
pa_render to drop its special case for a single stream so that it always
calls pa_mix; then this is your pa_mix function:


size_t pa_mix(
    const pa_mix_info streams[],
    unsigned nstreams,
    void *data,
    size_t length,
    const pa_sample_spec *spec,
    const pa_cvolume *volume,
    int mute) {

    assert(streams && data && length && spec);


#define MAX_STREAMS 8
    uint16_t scale_value[MAX_STREAMS];
    int16_t *buffer_pointer[MAX_STREAMS];

    for (unsigned i = 0; i < nstreams && i < MAX_STREAMS; i++) {
        /* Start of this stream's sample data within its memblock */
        buffer_pointer[i] = (int16_t *) ((uint8_t *) streams[i].chunk.memblock->data
                                         + streams[i].chunk.index);
        /* Mix only as many bytes as the shortest stream provides */
        if (streams[i].chunk.length < length)
            length = streams[i].chunk.length;
        /**
         * Scale linear software volumes to an exponential curve,
         * approximated here by raising x to the 2nd power
         *
         * We divide by 256 here because the lookup table was generated
         * at that granularity.
         */
        scale_value[i] = (uint16_t) pow2_table[
            (int) (streams[i].volume.values[0] / 256)].v_linear_pow2;
    }

    /* fastmix takes samples, not bytes (an S16 sample is 2 bytes) */
    length = length / 2;

	switch(nstreams)
	{
		case 1:
			fast_mix1_overflow(	data, length,
							buffer_pointer[0], scale_value[0]
							);
			break;
		case 2:
			fast_mix2_overflow(	data, length,
							buffer_pointer[0], scale_value[0],
							buffer_pointer[1], scale_value[1]
							);
			break;
		case 3:
			fast_mix3_overflow(	data, length,
							buffer_pointer[0], scale_value[0],
							buffer_pointer[1], scale_value[1],
							buffer_pointer[2], scale_value[2]
							);
			break;
		case 4:
			fast_mix4_overflow(	data, length,
							buffer_pointer[0], scale_value[0],
							buffer_pointer[1], scale_value[1],
							buffer_pointer[2], scale_value[2],
							buffer_pointer[3], scale_value[3]
							);
			break;
		case 5:
			fast_mix5_overflow(	data, length,
							buffer_pointer[0], scale_value[0],
							buffer_pointer[1], scale_value[1],
							buffer_pointer[2], scale_value[2],
							buffer_pointer[3], scale_value[3],
							buffer_pointer[4], scale_value[4]
							);
			break;
		case 6:
			fast_mix6_overflow(	data, length,
							buffer_pointer[0], scale_value[0],
							buffer_pointer[1], scale_value[1],
							buffer_pointer[2], scale_value[2],
							buffer_pointer[3], scale_value[3],
							buffer_pointer[4], scale_value[4],
							buffer_pointer[5], scale_value[5]
							);
			break;
		case 7:
			fast_mix7_overflow(	data, length,
							buffer_pointer[0], scale_value[0],
							buffer_pointer[1], scale_value[1],
							buffer_pointer[2], scale_value[2],
							buffer_pointer[3], scale_value[3],
							buffer_pointer[4], scale_value[4],
							buffer_pointer[5], scale_value[5],
							buffer_pointer[6], scale_value[6]
							);
			break;
		case 8:
			fast_mix8_overflow(	data, length,
							buffer_pointer[0], scale_value[0],
							buffer_pointer[1], scale_value[1],
							buffer_pointer[2], scale_value[2],
							buffer_pointer[3], scale_value[3],
							buffer_pointer[4], scale_value[4],
							buffer_pointer[5], scale_value[5],
							buffer_pointer[6], scale_value[6],
							buffer_pointer[7], scale_value[7]
							);
			break;
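		default:
			/* zero streams or more than MAX_STREAMS: nothing was mixed */
			printf("ERROR: pa_mix cannot handle %u streams\n", nstreams);
			break;
	}

	/* convert the sample count back to bytes for the caller */
	length = length * 2;

	return length;
}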

Then I have these functions for mixing (attached as fastmix.c)
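
For readers who cannot fetch the attachment, here is a rough sketch of what
a two-stream mixer with that argument order could look like. This is not the
contents of fastmix.c: the name sketch_mix2_s16, the Q15 scale format
(0x8000 meaning unity) and the clamping behavior are assumptions made for
illustration only.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical two-stream S16 mixer (not the real fast_mix2_overflow).
 * Each input is multiplied by an assumed Q15 gain, the results are
 * summed in 32 bits, and the sum is saturated to the 16-bit range.
 * 'samples' counts samples, not bytes.
 */
static void sketch_mix2_s16(int16_t *out, size_t samples,
                            const int16_t *a, uint16_t gain_a_q15,
                            const int16_t *b, uint16_t gain_b_q15)
{
    int16_t *end = out + samples;   /* stop pointer, no separate counter */

    while (out != end) {
        int32_t sum = ((int32_t) *a++ * gain_a_q15) / 32768
                    + ((int32_t) *b++ * gain_b_q15) / 32768;

        /* Saturate instead of wrapping around on overflow */
        if (sum > INT16_MAX)
            sum = INT16_MAX;
        else if (sum < INT16_MIN)
            sum = INT16_MIN;

        *out++ = (int16_t) sum;
    }
}
```

Writing one such function per stream count keeps every pointer and gain in
its own local variable, which gives the compiler a chance to hold them all
in registers for small stream counts.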
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fastmix.c
Type: text/x-csrc
Size: 6049 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/pulseaudio-discuss/attachments/20080730/8b7f39ad/attachment.c>

