Yesterday afternoon I tried to look into where PulseAudio spends its time. In a low-latency scenario PulseAudio seems to be a bit CPU intensive.

The tests were done with perf, analyzing PulseAudio git master on a 3.2 kernel and an AMD CPU that's about three years old. After starting PulseAudio I ran a 10 ms playback latency test for 30 seconds. I'm still learning to master perf, but here are some initial results:

1) The top four functions (and more than half of the top 20) are all in the kernel, and are a result of the ppoll call. It seems like we do about 10 ppolls for every packet: three in the I/O thread and seven in the main thread. The majority of the main thread ppolls are due to the iochannel, i.e. talking to the client. It looks like the iochannel is somewhat suboptimal in its construction, requiring more ppolls than should be necessary. But I have to analyze this more deeply. (I made a simple attempt to optimise it yesterday, but it didn't work.)

2) Second are all the atomic operations we do in handling our flists. There seems to be quite a penalty for doing memory fence operations, and there are plenty of those for just getting/putting something into an flist. Here it might also be possible to optimise things; maybe we can use a more efficient memory recycling mechanism for the most common workload.

3) Third is asking the kernel for the current hardware pointer. I have not looked into this more deeply yet, but it's possible that we ask several times for the current hardware position where we should ask only once.

--
David Henningsson, Canonical Ltd.
https://launchpad.net/~diwic

-------------- next part --------------
+  2.55%  lt-pulseaudio  [kernel.kallsyms]       [k] do_poll.isra.4
+  2.34%  lt-pulseaudio  [kernel.kallsyms]       [k] fget_light
+  2.14%  lt-pulseaudio  [kernel.kallsyms]       [k] _raw_spin_unlock_irqrestore
+  2.07%  lt-pulseaudio  [kernel.kallsyms]       [k] _raw_spin_lock_irqsave
+  1.85%  lt-pulseaudio  [snd_hda_intel]         [k] 0xc24
+  1.55%  lt-pulseaudio  [kernel.kallsyms]       [k] find_busiest_group
+  1.53%  lt-pulseaudio  [kernel.kallsyms]       [k] tg_load_down
+  1.43%  lt-pulseaudio  libpulsecommon-3.0.so   [.] stack_pop
+  1.37%  lt-pulseaudio  libpulsecommon-3.0.so   [.] pa_memblock_unref
+  1.11%  lt-pulseaudio  libalsa-util.so         [.] thread_func
+  1.06%  lt-pulseaudio  libpulsecommon-3.0.so   [.] stack_push
+  1.05%  lt-pulseaudio  [kernel.kallsyms]       [k] unix_poll
+  0.93%  lt-pulseaudio  [vdso]                  [.] 0x7fff2c1768c5
+  0.89%  lt-pulseaudio  ld-2.15.so              [.] check_match.11236
+  0.84%  lt-pulseaudio  libpulse.so.0.15.3      [.] pa_mainloop_dispatch
+  0.80%  lt-pulseaudio  [kernel.kallsyms]       [k] __ticket_spin_lock
+  0.79%  lt-pulseaudio  [kernel.kallsyms]       [k] ret_from_sys_call
+  0.72%  lt-pulseaudio  [kernel.kallsyms]       [k] do_sys_poll
+  0.68%  lt-pulseaudio  [kernel.kallsyms]       [k] fput
+  0.68%  lt-pulseaudio  [kernel.kallsyms]       [k] system_call
+  0.68%  lt-pulseaudio  [kernel.kallsyms]       [k] update_cfs_load
+  0.66%  lt-pulseaudio  [kernel.kallsyms]       [k] __schedule
+  0.66%  lt-pulseaudio  libpulsecommon-3.0.so   [.] do_something
+  0.65%  lt-pulseaudio  [kernel.kallsyms]       [k] fsnotify
+  0.62%  lt-pulseaudio  libpulse.so.0.15.3      [.] pa_mainloop_prepare
+  0.62%  lt-pulseaudio  [kernel.kallsyms]       [k] sysret_check
+  0.60%  lt-pulseaudio  [kernel.kallsyms]       [k] __pollwait
+  0.58%  lt-pulseaudio  [kernel.kallsyms]       [k] copy_user_generic_string
+  0.57%  lt-pulseaudio  libpulsecommon-3.0.so   [.] pa_flist_pop
+  0.55%  lt-pulseaudio  [kernel.kallsyms]       [k] eventfd_poll
+  0.53%  lt-pulseaudio  [kernel.kallsyms]       [k] native_write_msr_safe
+  0.52%  lt-pulseaudio  [kernel.kallsyms]       [k] find_next_bit
+  0.51%  lt-pulseaudio  libpulsecore-3.0.so     [.] pa_rtpoll_run
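
To make point 2 above a bit more concrete, here is a rough sketch of one possible "cheaper recycling path". It is only an illustration, not PulseAudio code. The idea is that a thread which both allocates and frees the same kind of block could keep a few of them in a private array and only fall back to the shared, atomically protected free list when that array is empty or full. The private path then needs no atomic operations or memory fences at all. Everything named here (local_cache, LOCAL_CACHE_SLOTS, the functions) is hypothetical, and malloc()/free() merely stand in for the shared flist, which this sketch does not model.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define LOCAL_CACHE_SLOTS 8 /* hypothetical size, not a PulseAudio constant */

/* Hypothetical recycle cache owned by exactly one thread.
 * Because only the owner touches it, no atomics or fences are needed. */
typedef struct local_cache {
    void *slots[LOCAL_CACHE_SLOTS];
    size_t n_used;        /* blocks currently parked in slots[] */
    size_t block_size;    /* size of every block handed out */
    unsigned long hits;   /* requests served from slots[] */
    unsigned long misses; /* requests that fell through to the slow path */
} local_cache;

void local_cache_init(local_cache *c, size_t block_size) {
    assert(c != NULL && block_size > 0);
    c->n_used = 0;
    c->block_size = block_size;
    c->hits = 0;
    c->misses = 0;
}

/* Get a block: reuse a parked one if there is one, otherwise take the
 * slow path. malloc() is an assumption standing in for the shared flist. */
void *local_cache_get(local_cache *c) {
    if (c->n_used > 0) {
        c->hits++;
        return c->slots[--c->n_used];
    }
    c->misses++;
    return malloc(c->block_size);
}

/* Return a block: park it if there is room, otherwise take the slow
 * path. free() is an assumption standing in for the shared flist. */
void local_cache_put(local_cache *c, void *p) {
    if (p == NULL)
        return;
    if (c->n_used < LOCAL_CACHE_SLOTS)
        c->slots[c->n_used++] = p;
    else
        free(p);
}

/* Release every block that is still parked, e.g. at thread exit. */
void local_cache_drain(local_cache *c) {
    while (c->n_used > 0)
        free(c->slots[--c->n_used]);
}
```

A caller would run local_cache_init() once per thread, then pair local_cache_get() with local_cache_put() in the hot path and call local_cache_drain() on shutdown. This only helps if blocks are usually freed on the thread that allocated them. Whether that holds for the memblock traffic measured above is exactly what would still need to be checked.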