On Fri, Sep 24, 2010 at 02:13:26PM +0800, Wu Fengguang wrote: > On Fri, Sep 24, 2010 at 01:17:29PM +0800, Wu Fengguang wrote: > > Greetings, > > > > We are debugging a new QEMU USB audio model that works fine with alsa > > backend but drops samples from time to time with pulseaudio backend. > > > > Traces show that the pa_simple_write() call normally does its job > > quickly, but sometimes takes ~60ms to return (the last write=XXms > > field): > > > > L1 qpa_thread_out rpos = 1499, pa->decr = 1155, decr = 1155, pa->live = 0, write=0ms > > L2 qpa_thread_out rpos = 605, pa->decr = 1154, decr = 1154, pa->live = 0, write=0ms > > L3 qpa_thread_out rpos = 1670, pa->decr = 1065, decr = 1065, pa->live = 0, write=1ms > > L4 qpa_thread_out rpos = 736, pa->decr = 1114, decr = 1114, pa->live = 0, write=0ms > > L5 qpa_thread_out rpos = 1850, pa->decr = 1114, decr = 1114, pa->live = 0, write=0ms > > L6 qpa_thread_out rpos = 957, pa->decr = 1155, decr = 1155, pa->live = 0, write=0ms > > L7 qpa_thread_out rpos = 15, pa->decr = 1106, decr = 1106, pa->live = 0, write=0ms > > ==> L8 qpa_thread_out rpos = 1127, pa->decr = 1112, decr = 1112, pa->live = 936, write=61ms > > ==> L9 qpa_thread_out rpos = 1127, pa->decr = 2048, decr = 2048, pa->live = 0, write=0ms > > > > The 61ms delay in L8 overruns the ring buffer inside qemu (note the > > decr=2048 in L9; I ran qemu with QEMU_PA_SAMPLES=2048). > > > > Skimming through the pulseaudio code, my wild guess is there are two > > possible source of delays (I excluded kernel problems since ALSA works > > fine): > > 1) memory allocation > > 2) pulseaudio internal lock > > 3) blockable syscall > > With "strace -Tp `pidof pulseaudio`", I do see several ppoll() > syscalls that take up to 59ms: > > (sorted by time spent in the syscall) > > ==> 0.059522 ppoll > ==> 0.059478 ppoll > ==> 0.056892 ppoll > 0.035356 ppoll > 0.028646 ppoll > 0.028407 ppoll > 0.027576 ppoll > ... This is the POLL that takes 59ms: ppoll([{fd=4, events=POLLIN}, {fd=7, events=POLLIN}, {fd=21, events=POLLIN}, {fd=27, events=POLLIN}, {fd=36, events=POLLIN}, {fd=35, events=POLLIN}, {fd=34, events=POLLIN}, {fd=33, events=POLLIN}, {fd=10, events=POLLIN}, {fd=32, events=POLLIN|POLLERR|POLLHUP}, {fd=32, events=0}, {fd=31, events=POLLIN}, {fd=30, events=POLLIN}, {fd=23, events=POLLIN}, {fd=26, events=POLLIN}, {fd=17, events=POLLIN}, {fd=20, events=POLLIN}, {fd=16, events=POLLIN|POLLERR|POLLHUP}, {fd=16, events=0}, {fd=15, events=POLLIN}, {fd=14, events=POLLIN}, {fd=6, events=POLLIN}], 22, NULL, NULL, 8) = 1 ([{fd=20, revents=POLLIN}]) <0.059522> There are no timeout in the POLL. So it will wakeup until "avail_min" is exceeded, which is a pretty high number in pulseaudio: $ cat /proc/asound/card0/pcm0p/sub0/sw_params tstamp_mode: ENABLE period_step: 1 ==> avail_min: 15503 start_threshold: 18446744073709551615 stop_threshold: 4611686018427387904 silence_threshold: 0 silence_size: 0 boundary: 4611686018427387904 "period_size" is also large: $ cat /proc/asound/card0/pcm0p/sub0/hw_params access: MMAP_INTERLEAVED format: S16_LE subformat: STD channels: 2 rate: 44100 (44100/1) ==> period_size: 8192 buffer_size: 16384 For normal alsa playback, their typical values are 1024. Pulseaudio sets them to large values and maybe is not entirely depending on the poll. It may be setting some timer to wake up itself when the DMA buffer runs low on data. Anyway the pulseaudio's large buffer scheme makes possible the 61ms pa_simple_write(). Qemu is not expecting such a large latency. Its QEMU_PA_SAMPLES defaults to 1024, which can only hold up to 1024/48=21ms samples. So the 61ms delay will obviously overflow the "data push" based USB audio model.. Thanks, Fengguang