On 09.04.2016 03:29, Raymond Yau wrote:
>
> 2016-4-9 2:23 AM, "Georg Chini" <georg at chini.tk> wrote:
> >
> > On 08.04.2016 18:01, Tanu Kaskinen wrote:
> >
> >>>>>>
> >>>>>> I can't follow that line of reasoning. In the beginning the ring
> >>>>>> buffer is filled to max, and once you call snd_pcm_start(), data
> >>>>>> starts to move from the ring buffer to other buffers (I'll call
> >>>>>> the other buffers the "not-ring-buffer"). Apparently the driver
> >>>>>> "sees" the not-ring-buffer only partially, since it reports a
> >>>>>> larger latency than just the ring buffer fill level, but it still
> >>>>>> doesn't report the full latency. The time between snd_pcm_start()
> >>>>>> and the point where the reported delay does not any more equal
> >>>>>> the written amount tells the size of the visible part of the
> >>>>>> not-ring-buffer - it's the time it took for the first sample to
> >>>>>> travel from the ring buffer to the invisible part of the
> >>>>>> not-ring-buffer. I don't understand how the time could say
> >>>>>> anything about the size of the invisible part of the
> >>>>>> not-ring-buffer. Your logic "works" only if the visible and
> >>>>>> invisible parts happen to be of the same size.
> >>>>>>
> >>>>>> You should get the same results by calculating
> >>>>>>
> >>>>>> adjusted delay = ring buffer fill level + 2 * (reported delay - ring buffer fill level)
> >>>>>>
> >>>>>> That formula doesn't make sense, but that's how I understand your
> >>>>>> logic works, with the difference that your fix is based on one
> >>>>>> measurement only, so it's constant over time, while my formula
> >>>>>> recalculates the adjustment every time the delay is queried, so
> >>>>>> the adjustment size varies somewhat depending on the granularity
> >>>>>> at which audio moves to and from the visible part of the
> >>>>>> not-ring-buffer.
> >>>>>>
> >>>>>> In any case, even if your logic actually makes sense and I'm just
> >>>>>> misunderstanding something, I don't see why the correction should
> >>>>>> be done in pulseaudio instead of the alsa driver.
> >>>>>
> >>>>> Well, now I don't understand what you mean. The logic is very
> >>>>> simple: If there is an unreported delay between the time
> >>>>> snd_pcm_start() is called and the time when the first sample is
> >>>>> delivered to the DAC, then this delay will persist and become part
> >>>>> of the continuous latency. That's all, what causes the delay is
> >>>>> completely irrelevant.
> >>>>
> >>>> The code can't know when the first sample hits the DAC. The delay
> >>>> reported by alsa is supposed to tell that, but if the reported delay
> >>>> is wrong, I don't think you have any way to know the real delay.
> >>>
> >>> Yes, the code can know when the first sample hits the DAC. I
> >>> explained it already. Before the first sample hits the DAC, the
> >>> delay is growing and larger than or equal to the number of samples
> >>> you have written to the buffer.
> >>> At the moment the delay is smaller than the write count, you can be
> >>> sure that at least some audio has been delivered. Since the delay is
> >>> decreased by the amount of audio that has been delivered to the DAC,
> >>> you can work back in time to the moment when the first sample has
> >>> been played.
> >>
> >> Yes, you explained that already, but you didn't give a convincing
> >> explanation of why the point in time when the delay stops growing
> >> would indicate the point when the first sample hit the DAC.
> >
> >
> > See below. The precondition for my thoughts naturally is that no
> > samples vanish from the latency reports, maybe that is where
> > we are thinking differently.
> >
> >
> >>
> >>>>> Maybe what I said above was not complete.
> >>>>> At the point in time when the first audio is played, there are two
> >>>>> delays: First the one that is reported by alsa and the other is
> >>>>> the difference between the time stamps minus the played audio. If
> >>>>> these two delays don't match, then there is an "extra delay" that
> >>>>> has to be taken into account.
> >>>>
> >>>> The difference between the time stamps is not related to how big the
> >>>> invisible part of the buffer is. I'll try to illustrate:
> >>>>
> >>>> In the beginning, pulseaudio has written 10 ms of audio to the ring
> >>>> buffer, and snd_pcm_start() hasn't been called:
> >>>>
> >>>> DAC <- ssssssssss|sss|dddddddddd <- pulseaudio
> >>>>
> >>>> Here "ssssssssss|sss|dddddddddd" is the whole buffer between the DAC
> >>>> and pulseaudio. It's divided into three parts; the pipe characters
> >>>> separate the different parts. Each letter represents 1 ms of data.
> >>>> "s" stands for silence and "d" stands for data. The first part of
> >>>> the buffer is the invisible part that is not included in the delay
> >>>> reports. I've put 10 ms of data there, but it's unknown to the
> >>>> driver how big the invisible part is. The middle part of the buffer
> >>>> is the "send buffer" that the driver maintains, its size is 3 ms in
> >>>> this example. It's filled with silence in the beginning. The third
> >>>> part is the ring buffer, containing 10 ms of data from pulseaudio.
> >>>>
> >>>> At this point the driver reports 10 ms latency. It knows it has 3 ms
> >>>> of silence buffered too, which it should include in its latency
> >>>> report, but it's stupid, so it only reports the data in the ring
> >>>> buffer. The driver has no idea how big the invisible part is, so it
> >>>> doesn't include it in the report.
> >>>>
> >>>> Now pulseaudio calls snd_pcm_start(), which causes data to start
> >>>> moving from the ring buffer to the send buffer.
> >>>> After 1 ms the situation looks like this:
> >>>>
> >>>> DAC <- ssssssssss|ssd|ddddddddd <- pulseaudio
> >>>>
> >>>> There's 2 ms of silence in the send buffer and 1 ms of data. The
> >>>> driver again ignores the silence in the send buffer, and reports
> >>>> that the delay is 10 ms, which consists of 1 ms of data in the send
> >>>> buffer and 9 ms of data in the ring buffer.
> >>>>
> >>>> After 2 ms:
> >>>>
> >>>> DAC <- ssssssssss|sdd|dddddddd <- pulseaudio
> >>>>
> >>>> Reported delay: 10 ms
> >>>>
> >>>> After 3 ms:
> >>>>
> >>>> DAC <- ssssssssss|ddd|ddddddd <- pulseaudio
> >>>>
> >>>> Reported delay: 10 ms
> >>>>
> >>>> Let's say pulseaudio refills the ring buffer now.
> >>>>
> >>>> DAC <- ssssssssss|ddd|dddddddddd <- pulseaudio
> >>>>
> >>>> Reported delay: 13 ms
> >>>>
> >>>> After 4 ms:
> >>>>
> >>>> DAC <- sssssssssd|ddd|ddddddddd <- pulseaudio
> >>>>
> >>>> The first data chunk has now entered the invisible part of the
> >>>> buffer, but it will still take 9 ms before it hits the DAC. At this
> >>>> point pulseaudio has written 13 ms of audio, and the reported delay
> >>>> is 12 ms. According to your logic, the adjusted delay is
> >>>> 12 + (4 - 1) = 15 ms, while in reality the latency is 22 ms.
> >>>
> >>> At this point, no audio has been played yet. You still have silence
> >>> in the buffer, so alsa would not report back, that samples have been
> >>> played.
> >>
> >> But the reported delay stopped growing! That's the point where you
> >> claim the first sample hits the DAC, but as my example illustrates,
> >> that doesn't seem to be true.
> >
> >
> > In your example it is not true, that's right. But for the USB devices
> > it is.
> > They only start decreasing the delay when real audio has been played,
> > and they would increase the delay when you write to the buffer,
> > I have checked that in the code.
> > And I think any driver that makes samples vanish is so severely
> > screwed, that we can't do anything about it.
> > If the driver reports complete moonshine numbers, you can't fix it,
> > I agree with you in that respect.
> >
> > But that is not the case with USB. There is only some missing latency
> > that is not reported - call it transport delay or whatever and I
> > suspect a similar delay can be found in other alsa drivers. There is
> > no need to figure out the reason for it, it just takes some time after
> > snd_pcm_start() was called until the first sample is played - without
> > making samples vanish.
> > And in that case the delay can be detected and used by the code.
> >
> >
> >>
> >>> I choose the point where the first d hits the DAC and that is
> >>> reported back by alsa. (see above) I've tried to put it all together
> >>> in a document.
> >>> I hope I can finish the part that deals with the smoother code today.
> >>> If so, I will send it to you privately because the part about
> >>> module-loopback is still missing.
> >>> Anyway, even if you think it is wrong I am still measuring the
> >>> correct end-to-end latency with my code, so something I am doing
> >>> must be right ...
> >>
> >> From what I can tell, that's a coincidence.
> >
> >
> > No, it definitely isn't. If you accept the precondition, that samples
> > do not simply vanish from the latency reports, it's physics.
> > I would tend to agree that I have overlooked something, if the "extra
> > delay" would be the same every time and if I could not write down
> > the math for it.
> > But it isn't completely constant (just in the same range) and I can
> > write down the math and it matches my measurements. So I am
> > fairly sure that I am right. Did you have a look at my document?
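
For readers who want to replay the numbers in Tanu's three-part buffer example quoted above, here is a minimal sketch in C. It is only a toy model, not driver code: the struct, the function names, and the fixed 10 ms / 3 ms sizes are assumptions taken from that example, and everything is counted in whole milliseconds. It computes the delay the example driver would report, the real latency, the one-shot timestamp correction as Tanu reads it, and the "2 *" formula he says should give the same result.

```c
#include <assert.h>

/* Sizes from the quoted example; real driver sizes are unknown. */
enum { INVISIBLE_MS = 10, SEND_MS = 3 };

/* Hypothetical model state, all values in milliseconds. */
struct model {
    int elapsed_ms;  /* time since snd_pcm_start() */
    int written_ms;  /* total audio written by pulseaudio */
    int ring_ms;     /* audio currently in the ring buffer */
};

/* pulseaudio writes `ms` of audio into the ring buffer. */
void model_write(struct model *m, int ms) {
    m->written_ms += ms;
    m->ring_ms += ms;
}

/* One millisecond passes: 1 ms of audio leaves the ring buffer. */
void model_tick(struct model *m) {
    assert(m->ring_ms > 0); /* this sketch does not model underruns */
    m->elapsed_ms += 1;
    m->ring_ms -= 1;
}

/* Delay reported by the example driver: data in the send buffer plus
 * data in the ring buffer. Silence and the invisible part are ignored. */
int reported_delay_ms(const struct model *m) {
    int left_ring = m->written_ms - m->ring_ms;
    int data_in_send = left_ring < SEND_MS ? left_ring : SEND_MS;
    return data_in_send + m->ring_ms;
}

/* Time until audio written right now would reach the DAC. */
int real_latency_ms(const struct model *m) {
    return INVISIBLE_MS + SEND_MS + m->ring_ms;
}

/* Timestamp-based correction: elapsed time minus the audio that the
 * report claims has been played, added to the reported delay. In the
 * thread this is measured once, when the report first drops below the
 * written amount. */
int timestamp_adjusted_delay_ms(const struct model *m) {
    int reported = reported_delay_ms(m);
    int played_per_report = m->written_ms - reported;
    return reported + (m->elapsed_ms - played_per_report);
}

/* The formula from the quoted text:
 * fill level + 2 * (reported delay - fill level). */
int formula_adjusted_delay_ms(const struct model *m) {
    return m->ring_ms + 2 * (reported_delay_ms(m) - m->ring_ms);
}
```

Replaying the example (write 10 ms, three ticks, write 3 ms, one more tick) gives a reported delay of 12 ms, 15 ms from both correction variants, and a real latency of 22 ms in this model. That gap only exists because the model lets samples disappear from the report, which is exactly the precondition under dispute.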
> >
> >
> >>
> >>>> I don't know how well this model reflects the reality of how the usb
> >>>> audio driver works, but this model seems like a plausible
> >>>> explanation for why the driver reports delays equalling the amount
> >>>> of written data in the beginning, and why the real latency is higher
> >>>> than the reported latency at later times.
> >>>>
> >>>> I hope this also clarifies why I don't buy your argument that the
> >>>> time stamp difference is somehow related to the unreported latency.
> >>>
> >>> No, in fact it doesn't.
> >>>
> >>>>> Trying to fix up that delay on every iteration does not make any
> >>>>> sense at all, it is there from the start and it is constant.
> >>>>
> >>>> Commenting on "it is constant": The playback latency is the sum of
> >>>> data in various buffers. The DAC consumes one sample at a time from
> >>>> the very last buffer, but I presume that all other places move data
> >>>> in bigger chunks than one sample. The unreported delay can only be
> >>>> constant if data moves to the invisible part of the buffering in one
> >>>> sample chunks. Otherwise the latency goes down every time the DAC
> >>>> reads a sample, and then when the buffer is refilled at the other
> >>>> end, the latency jumps up by the refill amount.
> >>>
> >>> I only said the "extra latency" is constant, not the latency as such.
> >>> See your own example above that your argument is wrong. Even
> >>> if the audio is moved in chunks through your invisible buffer part,
> >>> that part still has the same length all the time. When one "d" is
> >>> moved forward another one will replace it.
> >>
> >> No, the invisible part is not constant, even though my presentation
> >> didn't show the variance. The DAC consumes data from the invisible
> >> buffer one sample at a time, and each time it does that, the extra
> >> latency decreases by one sample. Data moves from the visible part of
> >> the buffer to the invisible part in bigger chunks.
> >> I didn't specify the chunk size, but if we assume 1 ms chunks, the
> >> extra latency grows by 1 ms every time a chunk is transferred from the
> >> visible part to the invisible part.
> >
> >
> > Then take any part of the buffer but the last or the first bit. All the
> > chunks are always full, so it's constant. The moving bit is dealt with
> > elsewhere, (in the smoother) but there is a lot of buffer that is
> > always full.
> > And when you take USB, the driver sees only chunks. The sample
> > by sample consuming of the DAC is never seen by the driver, it gets
> > the notification from USB that a chunk has been played.
> > I'm not sure how it is with HDA, but probably similar.
> >
> >>
> >>>>> This is not a negative delay reported by alsa, but my "extra
> >>>>> latency" is getting negative, which means playback must have
> >>>>> started before snd_pcm_start().
> >>>>> According to Raymond Yau playback seems in fact to be started
> >>>>> before snd_pcm_start() for HDA devices, at least if I read his last
> >>>>> mail on that topic right. Then the negative delays would even make
> >>>>> sense, since data is written to the buffer before snd_pcm_start().
> >>>>
> >>>> I had a look at the code to verify the claim that we configure alsa
> >>>> to start playback already before we call snd_pcm_start(). If we
> >>>> really do that intentionally, then it doesn't make sense to call
> >>>> snd_pcm_start() explicitly.
> >>>>
> >>>> This is what we do:
> >>>> snd_pcm_sw_params_set_start_threshold(pcm, swparams, (snd_pcm_uframes_t) -1)
> >>>>
> >>>> Note the casting of -1 to an unsigned integer. It seems that the
> >>>> intention is to set as high threshold as possible to avoid automatic
> >>>> starting. However, alsa-lib casts the threshold back to a signed
> >>>> value when it's used, and I believe the end result is indeed that
> >>>> playback starts immediately after the first write.
> >>>> I don't know if that matters, since we do the manual snd_pcm_start()
> >>>> call immediately after the first write anyway, but it seems like a
> >>>> bug in any case.
> >>
> >> Not very important, but I'll clarify one thing: I had another look,
> >> and I'm not any more sure that the code where I saw the casting back
> >> to a signed integer is actually used by pulseaudio. The function
> >> is snd_pcm_write_areas(), but pulseaudio doesn't call that at least
> >> directly, and I did some searching in alsa-lib too, and I didn't find
> >> a call path that would cause snd_pcm_write_areas() to be used by
> >> pulseaudio. Even if snd_pcm_write_areas() isn't used, though, it's
> >> entirely possible that there's some other code that does a similar
> >> cast. I don't know where the code is that triggers the
> >> snd_pcm_start() call when the ring buffer fill level exceeds the
> >> configured threshold. It might be in the kernel.
> >>
> >>> OK, this is why I measure an "extra latency" of -60 to -20 usec.
> >>> So again, if I can measure it and even detect a bug that way,
> >>> don't you think there must be some truth in what I'm saying?
> >>
> >> Do I understand correctly that your "extra latency" is affected by
> >> whether snd_pcm_start() is called implicitly in mmap_write() or
> >> explicitly after mmap_write()? The time when mmap_write() is called
> >> doesn't affect the latency in the long term.
> >
> > It does. It isn't much, but if playback starts earlier, the delay
> > will be exactly that amount less even after 10 hours of playback.
> > Let's assume you have 10ms of audio to write to the buffer.
> > During the time, when you write, samples are coming in.
> > Let's say it takes 100 usec to write the buffer. If you start
> > playback after the write, this will be 100 usec additional delay.
> > 5 samples have accumulated.
> > If you start playback immediately after the first bit of data is
> > written this might take much less time, say 20 usec.
> > So your delay is four samples less and it will remain that way
> > until the sink is stopped. There is nothing that would take away
> > the delay.
> >
> >
> >> The smoother will produce
> >> wrong values if it's not started at the same time as snd_pcm_start()
> >> is called, but I presume the smoother is able to fix such inaccuracies
> >> over time, so it doesn't matter that much when the snd_pcm_start() is
> >> called. So isn't it a bad thing if your "extra latency" permanently
> >> includes something that doesn't have any real effect after some time?
> >
> >
> > Yes, it is affected by it and it should be, because the "extra delay"
> > is the time between snd_pcm_start() and the first sample being
> > played. So if the first samples are played before snd_pcm_start()
> > the "extra latency" will become negative. And as explained above,
> > it has permanent effect. Somehow you seem to be of the opinion
> > that all delays that are not controlled by the pulseaudio code
> > vanish magically, but they don't.
> >
> > For the reported latency, it just means, that it will become slightly
> > smaller. As I said, the smoother does not use the "extra delay"
> > for anything, it is only calculated once when the origin for the
> > smoother is set and added later as an offset, when get_latency()
> > is called.
> >
>
> As your log had two "Starting Playback" messages, can you call
> snd_pcm_dump after snd_pcm_start to find the value of appl_ptr?
>

I will, but there is a suspend message between the two "Starting
Playback" messages:

sink.c: Suspending sink alsa_output.usb-0d8c_C-Media_USB_Headphone_Set-00.analog-stereo due to changing the sample rate.
sink.c: Suspend cause of sink alsa_output.usb-0d8c_C-Media_USB_Headphone_Set-00.analog-stereo is 0x0020, suspending

So I don't think there is a problem, but I will do your test and let
you know the results.

> Does pulseaudio prebuf mean minimum first write?
>

Don't know, according to Tanu, the first write will fill the buffer
to the configured latency. The log also shows this. Because the
buffer of module-loopback is filled when playback is started, buffering
should not be a problem.

> Does the loopback module stop the running pcm stream?
>
> It seems pulseaudio uses neither snd_pcm_drop nor snd_pcm_drain, so
> how can the running pcm stream stop?
>

This is the beginning of the suspend function of module-loopback,
so obviously snd_pcm_close is called instead of snd_pcm_drop or
_drain (I did not change anything here):

static int suspend(struct userdata *u) {
    pa_assert(u);
    pa_assert(u->pcm_handle);

    /* Let's suspend -- we don't call snd_pcm_drain() here since that might
     * take awfully long with our long buffer sizes today. */
    snd_pcm_close(u->pcm_handle);
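
To put a rough number on "awfully long" in that comment: a drain has to wait until the queued audio has been played, so the wait grows with the buffer fill level. The helper below is only an illustrative sketch. The function name and the example figures are assumptions, not values taken from pulseaudio or from the log above.

```c
#include <assert.h>
#include <stdint.h>

/* Time needed to play out `queued_frames` at `rate_hz`, in microseconds.
 * Both parameters are illustrative inputs; nothing is queried from ALSA. */
uint64_t drain_time_usec(uint64_t queued_frames, uint32_t rate_hz) {
    assert(rate_hz > 0); /* a zero sample rate is meaningless here */
    return queued_frames * UINT64_C(1000000) / rate_hz;
}
```

With a hypothetical 96000 frames queued at 48 kHz, drain_time_usec(96000, 48000) comes to 2000000 usec, i.e. two seconds of waiting before the suspend could finish, whereas closing the device returns without playing out the queue.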