Hi Arun,

Thanks for your message. I'm going to reply to you first, and then
reply to David.

Arun Raghavan <arun.raghavan at collabora.co.uk> writes:

> Having means of doing non-PCM streaming would definitely be desirable.
> That said, though, I'm a bit wary of introducing codec and RTP
> dependencies in PulseAudio (currently we have this at only one point,
> which is the Bluetooth modules - something I don't see a way around).
>
> Now there are two main concerns:
>
> 1. Codecs: choosing a codec is not simple. There are always potential
> reasons to look at one vs. the other - CPU utilisation, bandwidth, codec
> latency, quality, specific implementation (libav vs. reference
> implementation vs. hardware acceleration) and so on.

Indeed. I think a codec-agnostic implementation is important, as I
mentioned in an earlier message. I chose Opus for the same reasons it
was designed for, and I appreciate the desire for some lossless codec
too. But I hadn't spent much time thinking about implementation
differences, mainly because at the time I did my research, there was
only one implementation!

Your ideas about using GStreamer are interesting, but I don't know much
about how a GStreamer pipeline would fit into PulseAudio's pipeline, or
how this would affect things like latency. Nonetheless, you're
certainly right about the maintenance burden, and I am a fan of
GStreamer in any case. To have its power in PulseAudio would be very
interesting, and certainly worth the research I would need to do to
understand (say) the effect on latency.

> 2. RTP: as our usage gets more complicated, we're going to end up
> implementing and maintaining a non-trivial RTP stack, which is actually
> quite hard.
>
> Deciding where to draw the line with regards to what does and does not
> belong in PulseAudio is a bit tricky, but in my mind, encoding/decoding
> should very much not be in PulseAudio because that beast inevitably gets
> more complicated as you try to do more, and there are others solving the
> problem system-wide.
>
> RTP, I can see a case for it being in PulseAudio, but it is also
> complicated, and as with codecs, there are other places in the system
> where it gets more attention and maintenance.

David asked in his other messages whether my plans were for RTP or for
the native protocol. I'll describe my thinking here, too.

Originally, I had looked at writing code for implementing these codecs
in the native protocol, because I wanted to build the support as close
to the core of PulseAudio as possible; doing so seemed to be the
easiest way to keep on top of the logic for adjusting bitrate etc.
according to latency and frame drops, and I also felt that doing so in
PulseAudio's own protocol would give us more coding liberty.

Now, however, I am largely in agreement that it is probably foolish to
maintain this whole stack in PulseAudio, and furthermore, it's probably
good to use the streaming protocols that are widely known and
understood, rather than duplicating the effort in a way that is likely
to be suboptimal.

Do you agree that using RTP rather than building the GStreamer pipeline
closer to the core of PulseAudio is probably the best plan?

> The simplest idea I can think of to deal with this meaningfully is to
> wrap a sink/source around a GStreamer pipeline to offload all that work
> that we don't want to duplicate in PulseAudio.

Quite.

> On the sink side, we'd hook up to an appsrc to feed PCM data to a
> pipeline. The pipeline would take care of encoding, RTP packetisation
> and possibly a corresponding RTCP stream.
> This would allow codec selection to be flexible, and in the distant
> future, could even support taking encoded data directly.
>
> On the source side, we'd hook up to an appsink, receiving PCM data from
> the pipeline. The pipeline would take care decoding whatever the format
> is, take care of RTCP and maybe more advanced features such as a jitter
> buffer and packet-loss concealment (all of this can be plugged in or
> not, depending on configuration).

This does sound like a good plan to solve the given problem, save for
my concerns about monitoring and integration above.

My other thought, which would not work in the context of using an
appsrc/appsink, is that having GStreamer built more closely into
PulseAudio would also mean we could offload things like resampling
(which I noticed was discussed in another GSoC-related thread) to the
GStreamer pipeline, rather than having to maintain those code paths,
too.

> Doing it this way means you're using a better RTP stack that gets
> attention from a number of other use cases (plus assorted related
> goodies) and support for multiple codecs.

Right, indeed. But I'm still not entirely sure about RTP vs native, and
where the GStreamer code should go. My temptation is to couple
PulseAudio as closely as possible to GStreamer. But I thought about
this a little when doing my initial research, and came to the
conclusion that if this were a good idea, it would have been done
already. What am I missing?

Indeed, if we build GStreamer into pulsecore, then we could use RTP or
the native protocol, as we saw fit. If we went for the appsrc/appsink
solution, then we'd be less flexible.

I haven't read the code in depth in a while, and I have exams soon. I
cannot start my research properly until they are out of the way in a
month or so, so I don't know how plausible this is. But it sounds quite
fun.

Cheers,

Toby