2014-02-07 10:17 GMT+06:00 Arun Raghavan <arun at accosted.net>:
> Hello,
> This year's call for projects participation is out, and I'd like to
> gauge interest in participation. I'm happy to run org admin duty
> again, and if you've got ideas for a project and/or would like to
> mentor a student, please drop your name on the wiki:
>
> http://www.freedesktop.org/wiki/Software/PulseAudio/Software/PulseAudio/GSoC2014/
>
> We should decide one way or the other by mid next week so that we can
> get our org application in well in time if we're doing this.

Hello.

The following is mostly a copy-paste of ideas that I have already sent to the list or privately to people, plus a direct translation of some features provided by hardware. The text, of course, needs to be improved before the final submission to Google. I would be happy to review any related code.

1. Tool for objective automated noninteractive evaluation of the perceived resampler quality.

Problem statement: in commit 92bb9fb8b5aeebb87c4df7416e75db1782e2dd3a, the default resampler quality was changed without any objective arguments about the impact on the perceived sound quality. There is no tool to make such objective arguments, although there is enough science to create one. It should be created.

The task, as I see it, is to:

a) implement a well-respected published psychoacoustical model, or take an existing one;

b) quantify distortions (noise from rounding errors on intermediate results, unwanted aliased frequency content, attenuated high frequencies) introduced by the existing resamplers - i.e.
write a program that, given a sound file and the target sample rate, produces the dB level of the distortion introduced by a given resampler in each time interval at each frequency bin; bonus points for doing the same for the Windows and Mac OS X built-in resamplers (definitely doable by capturing their impulse response through KVM; I did this for Windows before writing the Wine resampler, but did not link it to any psychoacoustical model);

c) given a variety of real-world sound material (music of different genres, soundtracks, talks) and a psychoacoustical model, calculate the dB level of distortion that can be introduced in each time interval in each frequency bin without the average human noticing it;

d) compare the results from (b) and (c) and draw one of the following conclusions: "overkill", "just right", "introduces noticeable distortion in this frequency band, here is the problematic sample".

<off-topic>I am quite surprised that there was no "audiophile" discussion on the list or elsewhere, especially since the old default filter length closely matched what Windows XP does by default (I can state that as the author of the resampler used in Wine). But I can't make any statements about whether the new default is good enough without the mentioned tool.</off-topic>

Contacts: Alexander E. Patrakov

Necessary background: digital sound processing, access to scientific papers on the topic, Python with numpy and scipy, or any other mathematical toolbox. If I were to do this, numpy/scipy would be my toolbox of choice.

2. Rewind-friendly resampler.

Problem statement: As of now, when rewinding a sink input, PulseAudio resets the resampler. This is wrong and leads to audible clicks, but it is a necessary evil because none of the resampler libraries used by PulseAudio has a rewind-compatible API (i.e. the existing APIs don't allow one to say "forget the last 1000 input samples, tell me how many output samples should be forgotten due to that").
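
To make the "forget N input samples" request concrete, here is a minimal Python sketch of the frame arithmetic only. The class and its methods are hypothetical and are not part of PulseAudio or of any resampler library. The sketch assumes an idealized resampler that has produced floor(input * out_rate / in_rate) output frames after consuming a given number of input frames, and it ignores the filter history, which a real rewind-friendly resampler would also have to restore.

```python
class RewindBookkeeping:
    """Hypothetical frame accounting for a rewind-aware resampler.

    Assumption: after consuming n input frames the resampler has produced
    floor(n * out_rate / in_rate) output frames (filter delay is ignored).
    """

    def __init__(self, in_rate, out_rate):
        self.in_rate = in_rate
        self.out_rate = out_rate
        self.in_frames = 0  # input frames consumed so far

    def out_frames(self):
        # Integer arithmetic avoids floating-point drift on long streams.
        return (self.in_frames * self.out_rate) // self.in_rate

    def push(self, n_in):
        """Consume n_in input frames; return how many output frames appear."""
        before = self.out_frames()
        self.in_frames += n_in
        return self.out_frames() - before

    def rewind(self, n_in):
        """Forget the last n_in input frames; return how many output frames
        must be forgotten as a consequence."""
        n_in = min(n_in, self.in_frames)  # cannot rewind past the start
        before = self.out_frames()
        self.in_frames -= n_in
        return before - self.out_frames()


if __name__ == "__main__":
    r = RewindBookkeeping(44100, 48000)
    print(r.push(44100))   # one second of input
    print(r.rewind(1000))  # output frames invalidated by the rewind
    print(r.push(1000))    # rewriting the same input restores them
```

Under these assumptions, rewinding 1000 input frames at 44100 -> 48000 Hz after one second of audio invalidates 1089 output frames, and writing the same 1000 frames again produces exactly those 1089 frames back.
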
A new resampler has to be written, or an existing one improved, to such a degree that calling pa_stream_write() with the last two parameters other than 0,0 and overwriting the previously-written samples with themselves does not introduce clicks. Likewise, if a sink processes a rewind for internal reasons, there should be no clicks.

Contacts: Alexander E. Patrakov

Necessary background: digital sound processing, C

Note: a similar problem exists with virtual sink modules: module-equalizer-sink and module-virtual-surround-sink. However, the next two proposals invalidate a would-be "fix virtual sinks" proposal, as after them only essentially-realtime effects and module-ladspa-sink remain.

3. Equalizer in pavucontrol (very questionable, see below)

Problem statement: As of now, the only graphical frontend to module-equalizer-sink is qpaeq (PyQt4-based). A GTK-based frontend should be written and included into pavucontrol.

Contacts: Colin Guthrie?

Necessary background: C, GTK+, D-Bus

And here is why I think this is questionable. First, look at the module-equalizer-sink code. The impression is that it has been accepted without any review. It just pretends to "work". E.g. a buffer is allocated with fftwf_malloc() and freed with free() instead of fftwf_free(). The code is also wrong from the DSP viewpoint - e.g. it does nothing to ensure that the impulse response is shorter than the FFT size minus the window size, thus failing time invariance. If the sink is used at the 16000 Hz sampling rate or less, there is a buffer overflow due to an inconsistent choice of the FFT length and the window size. The algorithmic latency is fixed at 15999 samples, which is way too much. The module does not use any of the benefits (e.g. the chance to handle rewinds properly) of being a native PulseAudio module and not a LADSPA plugin.
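
The time-invariance remark above can be illustrated with a toy model in Python. It assumes the textbook condition for FFT-based block convolution: a block of W samples convolved with an impulse response of L taps fits into an N-point FFT only if W + L - 1 <= N. The windowed overlap actually used by module-equalizer-sink is not modelled here, and all function names are made up for this sketch. The circular convolution is computed directly, which gives the same result as multiplying two N-point FFTs and transforming back.

```python
def linear_convolve(block, taps):
    """Ordinary (time-invariant) convolution; output has W + L - 1 samples."""
    out = [0] * (len(block) + len(taps) - 1)
    for i, x in enumerate(block):
        for j, h in enumerate(taps):
            out[i + j] += x * h
    return out


def circular_convolve(block, taps, n):
    """Convolution modulo n, i.e. what an n-point FFT product computes."""
    out = [0] * n
    for i, x in enumerate(block):
        for j, h in enumerate(taps):
            out[(i + j) % n] += x * h  # the tail wraps around to the start
    return out


def block_is_alias_free(block, taps, n):
    """True if the n-point circular result equals the linear one."""
    linear = linear_convolve(block, taps)
    expected = (linear + [0] * n)[:n]  # pad or truncate to n samples
    return circular_convolve(block, taps, n) == expected


if __name__ == "__main__":
    block = [1, 2, 3, 4]                           # W = 4
    print(block_is_alias_free(block, [1] * 5, 8))  # 4 + 5 - 1 = 8 fits in N = 8
    print(block_is_alias_free(block, [1] * 6, 8))  # 4 + 6 - 1 = 9 does not fit
```

In the second case the last output sample wraps around and is added to the first one, so the result depends on where the block boundaries fall, which is exactly what a time-invariant filter must not do.
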
Veromix (an advanced mixer application for PulseAudio) already uses module-ladspa-sink instead of this, maybe due to the unified D-Bus API provided by module-ladspa-sink that allows Veromix to control other LADSPA plugins as well.

If I were you, I would delete the module right now instead of proposing this GSoC project. But then, "implement an equalizer in pavucontrol, using module-ladspa-sink as a backend" would be valid.

4. Channel remixer improvements. (needs splitting, see even more ideas in a big comment in resampler.c)

Problem statement: currently, PulseAudio has a remixer in its core that only produces instantaneous linear combinations of the input channels, and also module-virtual-surround-sink, which, given a wav file with head-related impulse responses, downmixes 5.1 to stereo while preserving spatial information. These two remixers interact badly with each other and with profile switches (see below), and this looks ugly to fix in the virtual-sink model. The goal is to introduce advanced upmixing and downmixing techniques into the PulseAudio core.

Bad interaction: suppose that one plays a 4.0 track through module-virtual-surround-sink. Module-virtual-surround-sink is a sink, so PulseAudio applies its usual remixing to all input streams using its core. Thus, module-virtual-surround-sink sees not the original 4.0 content that needs to be downmixed, but fake 5.1 content corrupted by the synthesized fake center and LFE channels. PA_RESAMPLER_NO_REMIX would help here, but introduces another problem: normalization. Again, there is no way to distinguish this from a 5.1 stream with a silent center channel. Ideally, for safety, the overall filter gain in the HRIR-aware downmixer should be such that there is no clipping even if all input channels are active - and that gain is different for the 5.1 and 4.0 cases, just because of the different number of channels. The core remixer erases this information.
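
The normalization argument can be sketched numerically in Python. The bound used here is the standard worst case: if every input sample has magnitude at most 1, the signal at one ear cannot exceed the sum of the absolute values of all filter taps feeding that ear. The tap values, channel names and the function name are invented for illustration and are not real HRIR data.

```python
# Toy impulse responses: channel -> (taps to left ear, taps to right ear).
# These numbers are made up; real HRIRs have hundreds of taps.
TOY_HRIRS = {
    "front-left":   ([0.5, 0.25], [0.25, 0.125]),
    "front-right":  ([0.25, 0.125], [0.5, 0.25]),
    "front-center": ([0.5, 0.125], [0.5, 0.125]),
    "lfe":          ([0.5], [0.5]),
    "rear-left":    ([0.5, -0.25], [0.25, -0.125]),
    "rear-right":   ([0.25, -0.125], [0.5, -0.25]),
}

LAYOUT_4_0 = ["front-left", "front-right", "rear-left", "rear-right"]
LAYOUT_5_1 = LAYOUT_4_0 + ["front-center", "lfe"]


def safe_gain(hrirs, channels):
    """Largest gain that cannot clip when all listed channels are full-scale."""
    worst_peak = 0.0
    for ear in (0, 1):  # 0 = left ear, 1 = right ear
        # Worst case for this ear: every tap contributes with the same sign.
        peak = sum(sum(abs(tap) for tap in hrirs[ch][ear]) for ch in channels)
        worst_peak = max(worst_peak, peak)
    return 1.0 / worst_peak


if __name__ == "__main__":
    print("4.0 gain:", safe_gain(TOY_HRIRS, LAYOUT_4_0))
    print("5.1 gain:", safe_gain(TOY_HRIRS, LAYOUT_5_1))
```

With these toy taps the 4.0 layout tolerates a higher gain than the 5.1 layout, so a downmixer that only ever sees "5.1" after the core remixer has to use the more pessimistic value even for 4.0 material.
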
Now consider that this listener unplugs the headphones. Now the sound should go to his 5.1 audio system, but instead it continues to play on this downmixer sink and gets further upmixed by the core. That's clearly wrong.

The same conclusion about profile interaction can be reached by considering module-equalizer-sink. It does not switch its number of channels even if its master sink does. As a result, a 2.0 -> 5.1 -> 2.0 yo-yo is entirely possible (and of course unwanted).

Writing a fancy upmixer based on reverse-engineered Dolby Pro Logic or on scientific papers is also within the scope of this project.

Current status: I have a rewritten (and rewind-friendly!) virtual sink module sitting on my laptop that applies arbitrary IIR filters. I will send it after cleaning up the scripts that generate the filter coefficients. This is already good enough to provide LFE channel extraction, to replace the virtual surround sink and even to provide a virtual surround effect on my laptop speakers, but of course not good enough to solve the profile-related problem.

Contact: Alexander E. Patrakov

Necessary background: digital sound processing, C

Possible split:

* integrate a multichannel-to-binaural HRIR-based downmixer into core (possibly after I publish the IIR sink) (and maybe allow its use even for stereo streams, to narrow them down if the user wants it)
* integrate a binaural-to-stereo remixer into core (when I publish the IIR sink, or based on the published ambiophonics research)
* integrate LFE extraction into core (when I publish the IIR sink, or independently)
* write and integrate a fancy stereo-to-5.1 upmixer based on published research
* integrate heuristics to apply and unapply the above effects appropriately

5. Per-channel delay (probably too simple)

Problem statement: some high-end audio receivers (e.g. Onkyo TX-NR626) have an option to introduce a separately-configurable delay in each channel.
This is needed, e.g., if due to room geometry constraints the speakers are not equidistant from the listener. This happens, e.g., with the front-center channel if one places all three front speakers near the wall - in this case, the front-center signal needs to be slightly delayed with respect to front-left and front-right in order to arrive at the listener at precisely the same moment. It would be nice to emulate this feature in PulseAudio for the benefit of users with cheap 5.1 analog speakers, and to provide a GUI for it.

Contact: Alexander E. Patrakov (?)

Necessary background: C, GTK+.

6. Digital Room Correction for PulseAudio

Problem statement: some high-end audio receivers (e.g. Onkyo TX-NR626) do not even have a graphical equalizer! Instead, they come with a calibrated microphone and a digital room correction feature in the firmware. They play a known test sound through each speaker, record what the microphone hears, and thus learn about the room acoustics. Then they apply this knowledge to equalize the played-back sound. This feature should be available to users of analog speakers, too, via PulseAudio. In fact, there already exists a free implementation of Digital Room Correction: http://drc-fir.sourceforge.net/ , so one just needs to write a FIR convolution engine for PulseAudio and a GUI for calibration. One also needs to think about how to work around the fact that a calibrated microphone is not always available - luckily there are some readily-available "calibrated" sound sources like popping bubble wrap.

Contact: Alexander E. Patrakov

Necessary background: C, digital sound processing, a calibrated microphone.

7. Intra-application sound mixing (needs discussion, may be a social problem after all)

Some time ago, I added a documentation patch (with some improvements from Tanu) about known misuse of the PulseAudio API.
As a part of that patch, I made a far-fetched but IMHO true statement that sometimes it is the responsibility of the application itself to mix its own streams (as is done in Wine) or to attenuate samples. However, I am afraid that this will be perceived as documentation of a PulseAudio bug (the inability to mix individual application streams without polluting the mixer GUIs with extra sliders) that just shifts the responsibility and extra work to individual developers. Also, this documentation is not read by developers who use PulseAudio not directly, but via wrappers like GStreamer and Qt, so a source of "application bugs" still exists.

To be fair, in GStreamer the problem looks solved: "audiomixer" performs synchronous in-application mixing - just what is needed. But not everyone uses or wants to use GStreamer. So I think that there is some room for improvement in PulseAudio itself.

Problem statement: add API functions to PulseAudio that would allow an application to request that its streams be mixed together without showing a separate volume slider for each of them in pavucontrol and similar PulseAudio mixer applications.

Contact: ?

Required background: C

8. LV2 sink (maybe too simple)

Problem statement: LV2 is the successor of LADSPA. Plugin authors are moving to the new API, but PulseAudio does not have any way to load these plugins and use them for sound processing. A new virtual sink needs to be written, as well as a GUI (possibly integrated into pavucontrol).

Contact: Alexander E. Patrakov

Required background: C, GTK+

9. Dynamic range compression (maybe already solved)

Problem statement: some consumer electronics (e.g. the Onkyo TX-NR626 receiver) have a mode in which they reduce the dynamic range of the incoming signal. This is supposed to be used when listening to classical music at night, so that neighbours don't wake up and the quietest passages are still audible. Make this feature available to users of cheap analog speakers, via PulseAudio.
Write a GUI for configuring it. This may be already solved by the vlevel LADSPA plugin (I have not tried it), but it needs GUI integration and heuristics to apply this only to high-latency music streams from players. A port may also be needed for rewind compatibility, but I am not sure whether that is possible at all.

Contact: ?

Required background: C, GTK+, digital signal processing (?)

10. GUI for module-combine-sink and module-remap-sink

Problem statement: the functionality to duplicate sound to several cards or to split one sound card into several virtual cards is currently available only via the configuration file or via pacmd. A GUI way to do the same tasks is needed.

Contact: ?

Required background: GTK+

P.S. With my current job, I don't have enough time to be a good mentor or even a good contributor. But I am open to job offers that would either allow me to work on PulseAudio from my home in Russia (preferred), or require relocation to either the UK or Ireland.

-- 
Alexander E. Patrakov