On 20.11.2015 16:18, Tanu Kaskinen wrote:
> On Fri, 2015-11-20 at 08:03 +0100, Georg Chini wrote:
>> On 20.11.2015 01:03, Tanu Kaskinen wrote:
>>> On Wed, 2015-02-25 at 19:43 +0100, Georg Chini wrote:
>>>> as well as the way of reacting to sink input or source output moving.
>>>>
>>>> The goal is to make sure that the initial latency of the system matches
>>>> the configured one.
>>>>
>>>> While at it, allow to set adjust_time to 0, which means "no adjustments
>>>> at all".
>>>> ---
>>>>  src/modules/module-loopback.c | 329 +++++++++++++++++++++++++++++++++++-------
>>>>  1 file changed, 275 insertions(+), 54 deletions(-)
>>>>
>>>> diff --git a/src/modules/module-loopback.c b/src/modules/module-loopback.c
>>>> index 79bd106..09f2f58 100644
>>>> --- a/src/modules/module-loopback.c
>>>> +++ b/src/modules/module-loopback.c
>>>> @@ -59,7 +59,9 @@ PA_MODULE_USAGE(
>>>>
>>>>  #define DEFAULT_LATENCY_MSEC 200
>>>>
>>>> -#define MEMBLOCKQ_MAXLENGTH (1024*1024*16)
>>>> +#define MEMBLOCKQ_MAXLENGTH (1024*1024*32)
>>> Why is this change done? Whatever the reason is, shouldn't this be done
>>> in a separate patch?
>> The queue is just too small if you want to run long delays (30s) together
>> with high sample rates (190000 Hz). I can put it in a separate patch.
>>
>>>> +
>>>> +#define DEFAULT_BUFFER_MARGIN_MSEC 20
>>>>
>>>>  #define DEFAULT_ADJUST_TIME_USEC (10*PA_USEC_PER_SEC)
>>>>
>>>> @@ -80,11 +82,21 @@ struct userdata {
>>>>
>>>>      int64_t recv_counter;
>>>>      int64_t send_counter;
>>>> +    uint32_t sink_adjust_counter;
>>>> +    uint32_t source_adjust_counter;
>>>>
>>>> -    size_t skip;
>>>>      pa_usec_t latency;
>>>> +    pa_usec_t buffer_latency;
>>>> +    pa_usec_t initial_buffer_latency;
>>>> +    pa_usec_t configured_sink_latency;
>>>> +    pa_usec_t configured_source_latency;
>>> It would be nice to have comments about what each of these different
>>> latency variables mean.
>>>
>>> I'm trying to understand what buffer_latency is...
I suppose it's used
>>> to configure the buffering so that the sink, source and buffer_latency
>>> together equal u->latency (i.e. the user-configured loopback latency).
>> Yes, the buffer latency is the part of the latency that is under direct
>> control of the module (mainly what is in the memblockq). It is the
>> parameter that is adjusted during the regulation process.
> Correct me if I'm wrong, but the parameter that is adjusted is the sink
> input rate. The regulation process doesn't touch the buffer_latency
> variable. The sink input rate affects the amount that is buffered in
> the memblockq, but the buffer_latency variable stays the same (except
> when moving the streams).

You are right, I was confusing the variable buffer_latency with "the
latency that is caused by the current number of samples in the buffer".
It's been too long since I wrote this patch ... buffer_latency is the
minimum amount of audio that must be kept in the buffer to avoid
underruns.

>
>>> It's 1/4 of u->latency by default, so I guess you try to configure the
>>> sink and source latencies to be 3/8 of u->latency each? buffer_latency
>>> can't be less than 1.667 ms, however. (Where does that number come
>>> from?)
>>>
>>> Hmm... When configuring the sink and source latencies, you divide
>>> u->latency by 3, which is consistent with the old code, so maybe it
>>> was a typo to divide u->latency by 4 when setting the default
>>> buffer_latency? I'll assume it was a typo, and you meant all three
>>> latency components to be one third of u->latency by default.
>> No, it is not a typo. The intention was to set the initial buffer
>> latency to something smaller than one third. The value is fixed up
>> later.
> So does the initial value have any effect on anything? It seems very
> weird that buffer_latency defaults to 1/4 of the full latency and sink
> and source latencies are 1/3.

Taking a closer look today I cannot find any reason why I set it to a
quarter of the latency.
It should be set to something like "requested latency - expected
source/sink latency - some msec" because I use it later in the
regulation if there are too many underruns. This is currently only done
correctly if you request a latency that cannot be satisfied. I will fix
it in the next version.

>
>>> buffer_latency can't be less than 3/4 of the sink or source latency
>>> (where does that 3/4 come from?), so if the sink or source latency
>>> exceeds that limit, buffer_latency is raised accordingly. That's
>>> with
>> Like many other values in this patch the 3/4 is the result of lots
>> of experiments with the module. If it is less than 3/4 of sink or
>> source latency you will start seeing underruns. You can go below
>> that value in certain situations, it's a safety margin.
> I think it should be possible to reason about the minimum safe value.
> We want to avoid underruns, but we also want to minimize the latency.
> I'm afraid experimental values will fail in cases that you haven't
> tested.
>
> I can't intuitively figure out what the minimum safe value is, but
> maybe some examples will show.
>
> Let's start with zero buffer_latency and 1 second sink and source
> latencies. Let's assume that both device buffers are full in the
> beginning, and that the sink requests more data when its buffer is
> empty, and the source pushes out data when its buffer becomes full.
> Let's also assume that all operations are instantaneous. The sink and
> source run at the same rate.
>
> time = 0 s
>
> buffered:
> source: 1 s    memblockq: 0 s    sink: 1 s
>
> The source is full, so the audio moves to the memblockq.
>
> source: 0 s    memblockq: 1 s    sink: 1 s
>
> ----
>
> time = 1 s
>
> The source is again full, so it will push data to the memblockq. The
> sink is empty, so it will pull data from the memblockq. These two
> things can be serialized in two ways.
> Either the source pushes first:
>
> source: 0 s    memblockq: 2 s    sink: 0 s
> source: 0 s    memblockq: 1 s    sink: 1 s
>
> or the sink pulls first:
>
> source: 1 s    memblockq: 0 s    sink: 1 s
> source: 0 s    memblockq: 1 s    sink: 1 s
>
> In neither case underruns happened.
>
> We're back in the same situation as in the beginning, so in this
> example buffer_latency can safely be zero.
>
>
> What if we have different latencies in the sink and source? Let's set
> source latency to 2 s and sink latency to 3 s. Let's keep other
> parameters and assumptions the same. In the beginning both devices
> have full buffers again:
>
> time = 0 s
>
> source: 2 s    memblockq: 0 s    sink: 3 s
>
> The source is full, so it pushes.
>
> source: 0 s    memblockq: 2 s    sink: 3 s
>
> ----
>
> time = 2 s
>
> The source is full, so it pushes.
>
> source: 0 s    memblockq: 4 s    sink: 1 s
>
> ----
>
> time = 3 s
>
> The sink is empty, so it pulls.
>
> source: 1 s    memblockq: 1 s    sink: 3 s
>
> ----
>
> time = 4 s
>
> The source is full, so it pushes.
>
> source: 0 s    memblockq: 3 s    sink: 2 s
>
> ----
>
> time = 6 s
>
> The source is full and the sink is empty, so they push and pull.
>
> source: 0 s    memblockq: 5 s    sink: 0 s
> source: 0 s    memblockq: 2 s    sink: 3 s
>
> or
>
> source: 2 s    memblockq: 0 s    sink: 3 s
> source: 0 s    memblockq: 2 s    sink: 3 s
>
> We're now back to where we started. Again, no underruns. To me it
> seems that buffer_latency doesn't need to be big. Underruns happen
> when the memblockq doesn't have enough data to satisfy the pull from
> the sink. In these two examples the memblockq got empty in some
> cases, but those were cases where the sink had just been satisfied
> and the source was just about to push more data to the queue, so in
> some sense an underrun was not even close.
>
> In the cases where the memblockq got empty, a rate mismatch could have
> caused an underrun: if the source is just about to get full, and the
> sink consumes a bit too much data from the memblockq, a small underrun
> will occur. To mitigate that, a small margin is needed, which I
> suppose should be proportional to the rate error and the sink buffer
> size (or max_request to be more accurate). Does the source buffer
> size play any role? I'm not sure. Anyway, 3/4 of the sink buffer
> seems excessive, because the rate error won't be anywhere close to
> 75%.
>
> Then there are scheduling delays and other overhead that break the
> assumption of instantaneous operations. Those have very little effect
> with big latencies, but with very low latencies they become more
> important, and your 3/4 rule isn't necessarily excessive.
>
> If buffer_latency only needs to cover the rate errors and cpu time,
> then its proportion of the whole latency should be big with low
> latencies and small with high latencies. Does this theory match with
> your experiments?

The point is that the assumption that the source_output and sink_input
rates are the same is not valid. As long as they are the same, you
will not hit a problem. Once you are in a steady state you only have
to care about jitter.
I cannot clearly remember how I derived that value, probably
experiments, but I still believe that 0.75 is a "good" estimate. If
you look at the 4 msec case, the buffer_latency is slightly lower than
3/4 of the sink latency (1.667 versus 1.75 ms) but this is also
already slightly unstable. In a more general case the resulting
latency will be 1.75 * minimum_sink_latency, which I would consider
small enough.
Regarding your concern that we want to keep latency down: the old
module-loopback starts to throw errors in the log when you go down to
7 msec latency on Intel HDA. The new module will run happily at
4 msec, so it is still an improvement.
With the controller I am currently developing together with Alexander
it might even be possible to go further down because it is better at
keeping the latency constant.

>>> dynamic latency support. If the sink or source doesn't support
>>> dynamic latency, buffer_latency is raised to
>>> default_fragment_size_msec + 20 ms. I don't think it makes sense to
>>> use default_fragment_size_msec. That variable is not guaranteed to
>>> have any relation to the sink/source behaviour. Something derived
>>> from max_request for sinks would probably be appropriate. I'm not
>>> sure about sources, maybe the fixed_latency variable could be used.
>> Mh, that is also a value derived from experiments and a safety
>> margin to avoid underruns. I can say that for USB cards there is a
>> clear connection between default_fragment_size_msec and the
>> necessary buffer_latency to avoid underruns. Don't know if
>> max_request is somehow connected to default_fragment_size.
> Yes, max_request is connected to default_fragment_size. For the alsa
> sink, max_request is the same as the configured latency, which
> default_fragment_size affects.

To me it looks like max_request and fixed_latency are derived from
default_fragment_size * default_fragments. The probability of
underruns only depends on default_fragment_size, the number of
fragments is irrelevant. The question is how large the chunks are that
are transferred between memblockq and source/sink (or between sound
card and pulse?). So neither max_request nor fixed_latency seem to be
the right indicators.
I remember that it took me quite a while to figure out which values
are safe for batch and non-batch cards. My tests have been done with
USB, Intel HDA and bluetooth.

>>> I'm going to sleep now. I can continue reviewing this patch
>>> tomorrow, or I can do it when you send v2, if you split this into
>>> more manageable pieces (I'd prefer the latter). Which one do you
>>> think is better?
>>>
>> I'll split it in the next version, but I would appreciate it if you
>> can at least check the rest of the patch and see whether something is
>> obviously wrong on first inspection.
> Ok, I'll continue with this patch still. I said I'd do it today, but
> it will have to wait until tomorrow, sorry.
>
> --
> Tanu